Overview


Dataset statistics

Number of variables: 92
Number of observations: 455240
Missing cells: 25245484
Missing cells (%): 60.3%
Total size in memory: 319.5 MiB
Average record size in memory: 736.0 B
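
The derived figures in this block can be cross-checked from the raw counts. The following is a minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over the full variables × observations grid and that the average record size is total memory divided by the number of observations. Both are assumptions about how the report derives its numbers, and the variable names are illustrative only.

```python
# Raw figures copied from the dataset statistics.
n_variables = 92
n_observations = 455240
missing_cells = 25245484
total_memory_mib = 319.5

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes, averaged over all observations.
avg_record_bytes = total_memory_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

The missing-cell percentage rounds to the reported 60.3%. The average record size computed this way agrees with the reported 736.0 B only to within the rounding of the 319.5 MiB input.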

Variable types

Text: 92

Dataset

Description: Fish NMNH Extant Specimen Records 0055081-241126133413365
URL: https://doi.org/10.15468/dl.34mb2x

Alerts

institutionID has constant value "urn:lsid:biocol.org:col:34871" Constant
collectionID has constant value "urn:uuid:09c9cf5f-f5d3-48cc-b5c8-cd9b9fbd631f" Constant
institutionCode has constant value "USNM" Constant
collectionCode has constant value "FISH" Constant
datasetName has constant value "NMNH Extant Biology" Constant
sex has constant value "male" Constant
organismID has constant value "68° 53' 56.61" W" Constant
eventTime has constant value "941" Constant
eventRemarks has constant value "Guide to Best Practices for Georeferencing. (Chapman and Wieczorek, eds. 2006). Google Earth Pro" Constant
locationRemarks has constant value "Williams, Jeffrey T." Constant
coordinatePrecision has constant value "Perciformes" Constant
footprintSRS has constant value "271" Constant
georeferencedDate has constant value "9" Constant
latestEonOrHighestEonothem has constant value "Animalia" Constant
earliestEraOrLowestErathem has constant value "Chordata" Constant
latestEraOrHighestErathem has constant value "Actinopterygii" Constant
latestEpochOrHighestSeries has constant value "Indian Ocean, Red Sea, Sudan, Red Sea State" Constant
earliestAgeOrLowestStage has constant value "Indian Ocean" Constant
latestAgeOrHighestStage has constant value "Indian Ocean, Red Sea" Constant
lithostratigraphicTerms has constant value "Sudan" Constant
identificationID has constant value "Mersa Towartit, Sudan" Constant
phylum has constant value "Chordata" Constant
taxonRank has constant value "subspecies" Constant
recordNumber has 434413 (95.4%) missing values Missing
recordedBy has 287329 (63.1%) missing values Missing
sex has 455237 (> 99.9%) missing values Missing
preparations has 346202 (76.0%) missing values Missing
associatedMedia has 361191 (79.3%) missing values Missing
associatedSequences has 454790 (99.9%) missing values Missing
occurrenceRemarks has 290501 (63.8%) missing values Missing
organismID has 455238 (> 99.9%) missing values Missing
organismRemarks has 455237 (> 99.9%) missing values Missing
materialEntityID has 455237 (> 99.9%) missing values Missing
eventType has 455237 (> 99.9%) missing values Missing
fieldNumber has 274230 (60.2%) missing values Missing
eventDate has 57339 (12.6%) missing values Missing
eventTime has 455239 (> 99.9%) missing values Missing
startDayOfYear has 78261 (17.2%) missing values Missing
endDayOfYear has 78054 (17.1%) missing values Missing
year has 57340 (12.6%) missing values Missing
month has 77299 (17.0%) missing values Missing
day has 91020 (20.0%) missing values Missing
verbatimEventDate has 92476 (20.3%) missing values Missing
eventRemarks has 455239 (> 99.9%) missing values Missing
locationID has 352031 (77.3%) missing values Missing
higherGeography has 20496 (4.5%) missing values Missing
continent has 23626 (5.2%) missing values Missing
waterBody has 133285 (29.3%) missing values Missing
islandGroup has 390836 (85.9%) missing values Missing
island has 270616 (59.4%) missing values Missing
country has 35901 (7.9%) missing values Missing
stateProvince has 174317 (38.3%) missing values Missing
county has 357556 (78.5%) missing values Missing
locality has 45088 (9.9%) missing values Missing
verbatimElevation has 453036 (99.5%) missing values Missing
minimumDepthInMeters has 249167 (54.7%) missing values Missing
maximumDepthInMeters has 263906 (58.0%) missing values Missing
verbatimDepth has 446663 (98.1%) missing values Missing
locationRemarks has 455239 (> 99.9%) missing values Missing
decimalLatitude has 254361 (55.9%) missing values Missing
decimalLongitude has 254361 (55.9%) missing values Missing
geodeticDatum has 448060 (98.4%) missing values Missing
coordinateUncertaintyInMeters has 450085 (98.9%) missing values Missing
coordinatePrecision has 455238 (> 99.9%) missing values Missing
verbatimCoordinates has 455238 (> 99.9%) missing values Missing
verbatimLatitude has 275360 (60.5%) missing values Missing
verbatimLongitude has 275421 (60.5%) missing values Missing
verbatimCoordinateSystem has 308955 (67.9%) missing values Missing
verbatimSRS has 455237 (> 99.9%) missing values Missing
footprintSRS has 455239 (> 99.9%) missing values Missing
footprintSpatialFit has 455232 (> 99.9%) missing values Missing
georeferencedBy has 455238 (> 99.9%) missing values Missing
georeferencedDate has 455239 (> 99.9%) missing values Missing
georeferenceProtocol has 437858 (96.2%) missing values Missing
georeferenceRemarks has 432224 (94.9%) missing values Missing
earliestEonOrLowestEonothem has 455233 (> 99.9%) missing values Missing
latestEonOrHighestEonothem has 455233 (> 99.9%) missing values Missing
earliestEraOrLowestErathem has 455233 (> 99.9%) missing values Missing
latestEraOrHighestErathem has 455233 (> 99.9%) missing values Missing
earliestPeriodOrLowestSystem has 455233 (> 99.9%) missing values Missing
earliestEpochOrLowestSeries has 455233 (> 99.9%) missing values Missing
latestEpochOrHighestSeries has 455239 (> 99.9%) missing values Missing
earliestAgeOrLowestStage has 455239 (> 99.9%) missing values Missing
latestAgeOrHighestStage has 455239 (> 99.9%) missing values Missing
lowestBiostratigraphicZone has 455233 (> 99.9%) missing values Missing
lithostratigraphicTerms has 455239 (> 99.9%) missing values Missing
formation has 455232 (> 99.9%) missing values Missing
identificationID has 455239 (> 99.9%) missing values Missing
identificationQualifier has 453543 (99.6%) missing values Missing
typeStatus has 435739 (95.7%) missing values Missing
identifiedBy has 421099 (92.5%) missing values Missing
genus has 23303 (5.1%) missing values Missing
subgenus has 454953 (99.9%) missing values Missing
specificEpithet has 67324 (14.8%) missing values Missing
infraspecificEpithet has 445321 (97.8%) missing values Missing
taxonRank has 445321 (97.8%) missing values Missing
scientificNameAuthorship has 416165 (91.4%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique
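
The three alert kinds listed above (Constant, Missing, Unique) can be illustrated with a small sketch. This is not the profiling library's implementation: the function name `column_alerts`, the `missing_threshold` parameter, and the rule that a single non-missing value counts as Constant rather than Unique are all assumptions made for illustration.

```python
from collections import Counter


def column_alerts(values, missing_threshold=0.0):
    """Return illustrative alert labels for one column.

    `None` marks a missing cell. `missing_threshold` is a hypothetical
    cut-off: the column is flagged Missing when its missing fraction
    exceeds it.
    """
    n = len(values)
    present = [v for v in values if v is not None]
    distinct = Counter(present)
    n_missing = n - len(present)

    alerts = []
    # Constant: exactly one distinct value among the non-missing cells.
    if len(distinct) == 1:
        alerts.append("Constant")
    # Missing: the share of missing cells exceeds the threshold.
    if n and n_missing / n > missing_threshold:
        alerts.append("Missing")
    # Unique: more than one non-missing cell and no value repeats.
    if len(present) > 1 and len(distinct) == len(present):
        alerts.append("Unique")
    return alerts


print(column_alerts(["USNM", "USNM", "USNM"]))
print(column_alerts(["male", None, None, None]))
print(column_alerts(["1317202656", "1317202715", "1322535976"]))
```

Under these assumptions a column such as sex, which has one distinct value in a handful of non-missing cells, receives both the Constant and the Missing label, as it does in the list above.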

Reproduction

Analysis started: 2025-01-14 16:50:44.995552
Analysis finished: 2025-01-14 16:51:01.098151
Duration: 16.1 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
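
The reported duration follows from the two timestamps. A minimal sketch of the subtraction, using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2025-01-14 16:50:44.995552")
finished = datetime.fromisoformat("2025-01-14 16:51:01.098151")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.1f} seconds")
```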

Variables

gbifID
Text

Unique 

Distinct: 455240
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 4552400
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
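
As an illustration of the character properties mentioned here, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own code. The standard library exposes the general category (for example `Nd` for Decimal Number, `Pd` for Dash Punctuation, `Po` for Other Punctuation, `Zs` for Space Separator) but not scripts or blocks, so those two breakdowns would need another data source. The helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Two sample values taken from this report.
for sample in ["1317202656", "2023-06-02 12:34:00"]:
    print(sample, dict(category_counts([sample])))
```

A value made only of digits falls into a single category, while a timestamp-like value spreads over four.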

Unique

Unique: 455240
Unique (%): 100.0%

Sample

1st row: 1317202656
2nd row: 1317202715
3rd row: 1322535976
4th row: 1317203467
5th row: 2235732924

Value                  Count   Frequency (%)
1317202656             1       < 0.1%
1317216362             1       < 0.1%
3758717622             1       < 0.1%
1317206835             1       < 0.1%
1322539466             1       < 0.1%
2235733055             1       < 0.1%
1322541352             1       < 0.1%
1843575433             1       < 0.1%
1843575436             1       < 0.1%
1322545228             1       < 0.1%
Other values (455230)  455230  > 99.9%

Most occurring characters

Value  Count   Frequency (%)
1      954825  21.0%
3      714977  15.7%
2      572018  12.6%
8      367615  8.1%
0      348591  7.7%
9      346673  7.6%
7      345283  7.6%
4      314241  6.9%
5      302170  6.6%
6      286007  6.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4552400
100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
1      954825  21.0%
3      714977  15.7%
2      572018  12.6%
8      367615  8.1%
0      348591  7.7%
9      346673  7.6%
7      345283  7.6%
4      314241  6.9%
5      302170  6.6%
6      286007  6.3%

Most occurring scripts

ValueCountFrequency (%)
Common 4552400
100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
1      954825  21.0%
3      714977  15.7%
2      572018  12.6%
8      367615  8.1%
0      348591  7.7%
9      346673  7.6%
7      345283  7.6%
4      314241  6.9%
5      302170  6.6%
6      286007  6.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4552400
100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
1      954825  21.0%
3      714977  15.7%
2      572018  12.6%
8      367615  8.1%
0      348591  7.7%
9      346673  7.6%
7      345283  7.6%
4      314241  6.9%
5      302170  6.6%
6      286007  6.3%
Distinct: 55510
Distinct (%): 12.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 8649560
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 31274
Unique (%): 6.9%

Sample

1st row: 2023-06-02 12:34:00
2nd row: 2019-11-27 11:21:00
3rd row: 2018-02-21 11:18:00
4th row: 2020-03-23 11:52:00
5th row: 2019-07-18 12:15:00
ValueCountFrequency (%)
2014-08-20 47469
 
5.2%
2014-08-25 38468
 
4.2%
2014-08-26 18583
 
2.0%
2019-07-18 14217
 
1.6%
2020-02-03 9198
 
1.0%
2017-12-18 6844
 
0.8%
2017-03-27 6177
 
0.7%
2017-05-23 6028
 
0.7%
2014-08-19 5504
 
0.6%
2018-07-27 4620
 
0.5%
Other values (3686) 753372
82.7%

Most occurring characters

ValueCountFrequency (%)
0 2223367
25.7%
1 1214317
14.0%
2 1142547
13.2%
- 910480
10.5%
: 910480
10.5%
(space) 455240
 
5.3%
4 380622
 
4.4%
8 314132
 
3.6%
3 305446
 
3.5%
5 267568
 
3.1%
Other values (3) 525361
 
6.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6373360
73.7%
Dash Punctuation 910480
 
10.5%
Other Punctuation 910480
 
10.5%
Space Separator 455240
 
5.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2223367
34.9%
1 1214317
19.1%
2 1142547
17.9%
4 380622
 
6.0%
8 314132
 
4.9%
3 305446
 
4.8%
5 267568
 
4.2%
7 199874
 
3.1%
9 176659
 
2.8%
6 148828
 
2.3%
Dash Punctuation
ValueCountFrequency (%)
- 910480
100.0%
Other Punctuation
ValueCountFrequency (%)
: 910480
100.0%
Space Separator
ValueCountFrequency (%)
(space) 455240
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8649560
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2223367
25.7%
1 1214317
14.0%
2 1142547
13.2%
- 910480
10.5%
: 910480
10.5%
(space) 455240
 
5.3%
4 380622
 
4.4%
8 314132
 
3.6%
3 305446
 
3.5%
5 267568
 
3.1%
Other values (3) 525361
 
6.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8649560
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2223367
25.7%
1 1214317
14.0%
2 1142547
13.2%
- 910480
10.5%
: 910480
10.5%
(space) 455240
 
5.3%
4 380622
 
4.4%
8 314132
 
3.6%
3 305446
 
3.5%
5 267568
 
3.1%
Other values (3) 525361
 
6.1%

institutionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 29
Median length: 29
Mean length: 29
Min length: 29

Characters and Unicode

Total characters: 13201960
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:lsid:biocol.org:col:34871
2nd row: urn:lsid:biocol.org:col:34871
3rd row: urn:lsid:biocol.org:col:34871
4th row: urn:lsid:biocol.org:col:34871
5th row: urn:lsid:biocol.org:col:34871
ValueCountFrequency (%)
urn:lsid:biocol.org:col:34871 455240
100.0%

Most occurring characters

ValueCountFrequency (%)
o 1820960
13.8%
: 1820960
13.8%
l 1365720
 
10.3%
i 910480
 
6.9%
r 910480
 
6.9%
c 910480
 
6.9%
g 455240
 
3.4%
7 455240
 
3.4%
8 455240
 
3.4%
4 455240
 
3.4%
Other values (8) 3641920
27.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8649560
65.5%
Other Punctuation 2276200
 
17.2%
Decimal Number 2276200
 
17.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 1820960
21.1%
l 1365720
15.8%
i 910480
10.5%
r 910480
10.5%
c 910480
10.5%
g 455240
 
5.3%
u 455240
 
5.3%
b 455240
 
5.3%
d 455240
 
5.3%
s 455240
 
5.3%
Decimal Number
ValueCountFrequency (%)
7 455240
20.0%
8 455240
20.0%
4 455240
20.0%
3 455240
20.0%
1 455240
20.0%
Other Punctuation
ValueCountFrequency (%)
: 1820960
80.0%
. 455240
 
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8649560
65.5%
Common 4552400
34.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 1820960
21.1%
l 1365720
15.8%
i 910480
10.5%
r 910480
10.5%
c 910480
10.5%
g 455240
 
5.3%
u 455240
 
5.3%
b 455240
 
5.3%
d 455240
 
5.3%
s 455240
 
5.3%
Common
ValueCountFrequency (%)
: 1820960
40.0%
7 455240
 
10.0%
8 455240
 
10.0%
4 455240
 
10.0%
3 455240
 
10.0%
. 455240
 
10.0%
1 455240
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13201960
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 1820960
13.8%
: 1820960
13.8%
l 1365720
 
10.3%
i 910480
 
6.9%
r 910480
 
6.9%
c 910480
 
6.9%
g 455240
 
3.4%
7 455240
 
3.4%
8 455240
 
3.4%
4 455240
 
3.4%
Other values (8) 3641920
27.6%

collectionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 45
Median length: 45
Mean length: 45
Min length: 45

Characters and Unicode

Total characters: 20485800
Distinct characters: 18
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:uuid:09c9cf5f-f5d3-48cc-b5c8-cd9b9fbd631f
2nd row: urn:uuid:09c9cf5f-f5d3-48cc-b5c8-cd9b9fbd631f
3rd row: urn:uuid:09c9cf5f-f5d3-48cc-b5c8-cd9b9fbd631f
4th row: urn:uuid:09c9cf5f-f5d3-48cc-b5c8-cd9b9fbd631f
5th row: urn:uuid:09c9cf5f-f5d3-48cc-b5c8-cd9b9fbd631f
ValueCountFrequency (%)
urn:uuid:09c9cf5f-f5d3-48cc-b5c8-cd9b9fbd631f 455240
100.0%

Most occurring characters

ValueCountFrequency (%)
c 2731440
13.3%
f 2276200
11.1%
9 1820960
8.9%
- 1820960
8.9%
d 1820960
8.9%
b 1365720
 
6.7%
5 1365720
 
6.7%
u 1365720
 
6.7%
: 910480
 
4.4%
3 910480
 
4.4%
Other values (8) 4097160
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10925760
53.3%
Decimal Number 6828600
33.3%
Dash Punctuation 1820960
 
8.9%
Other Punctuation 910480
 
4.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 2731440
25.0%
f 2276200
20.8%
d 1820960
16.7%
b 1365720
12.5%
u 1365720
12.5%
r 455240
 
4.2%
i 455240
 
4.2%
n 455240
 
4.2%
Decimal Number
ValueCountFrequency (%)
9 1820960
26.7%
5 1365720
20.0%
3 910480
13.3%
8 910480
13.3%
0 455240
 
6.7%
4 455240
 
6.7%
6 455240
 
6.7%
1 455240
 
6.7%
Dash Punctuation
ValueCountFrequency (%)
- 1820960
100.0%
Other Punctuation
ValueCountFrequency (%)
: 910480
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10925760
53.3%
Common 9560040
46.7%

Most frequent character per script

Common
ValueCountFrequency (%)
9 1820960
19.0%
- 1820960
19.0%
5 1365720
14.3%
: 910480
9.5%
3 910480
9.5%
8 910480
9.5%
0 455240
 
4.8%
4 455240
 
4.8%
6 455240
 
4.8%
1 455240
 
4.8%
Latin
ValueCountFrequency (%)
c 2731440
25.0%
f 2276200
20.8%
d 1820960
16.7%
b 1365720
12.5%
u 1365720
12.5%
r 455240
 
4.2%
i 455240
 
4.2%
n 455240
 
4.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20485800
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 2731440
13.3%
f 2276200
11.1%
9 1820960
8.9%
- 1820960
8.9%
d 1820960
8.9%
b 1365720
 
6.7%
5 1365720
 
6.7%
u 1365720
 
6.7%
: 910480
 
4.4%
3 910480
 
4.4%
Other values (8) 4097160
20.0%

institutionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1820960
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: USNM
2nd row: USNM
3rd row: USNM
4th row: USNM
5th row: USNM
ValueCountFrequency (%)
usnm 455240
100.0%

Most occurring characters

ValueCountFrequency (%)
U 455240
25.0%
S 455240
25.0%
N 455240
25.0%
M 455240
25.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1820960
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 455240
25.0%
S 455240
25.0%
N 455240
25.0%
M 455240
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1820960
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 455240
25.0%
S 455240
25.0%
N 455240
25.0%
M 455240
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1820960
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 455240
25.0%
S 455240
25.0%
N 455240
25.0%
M 455240
25.0%

collectionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1820960
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: FISH
2nd row: FISH
3rd row: FISH
4th row: FISH
5th row: FISH
ValueCountFrequency (%)
fish 455240
100.0%

Most occurring characters

ValueCountFrequency (%)
F 455240
25.0%
I 455240
25.0%
S 455240
25.0%
H 455240
25.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1820960
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
F 455240
25.0%
I 455240
25.0%
S 455240
25.0%
H 455240
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1820960
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
F 455240
25.0%
I 455240
25.0%
S 455240
25.0%
H 455240
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1820960
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
F 455240
25.0%
I 455240
25.0%
S 455240
25.0%
H 455240
25.0%

datasetName
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 8649560
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NMNH Extant Biology
2nd row: NMNH Extant Biology
3rd row: NMNH Extant Biology
4th row: NMNH Extant Biology
5th row: NMNH Extant Biology
ValueCountFrequency (%)
nmnh 455240
33.3%
extant 455240
33.3%
biology 455240
33.3%

Most occurring characters

ValueCountFrequency (%)
N 910480
 
10.5%
(space) 910480
 
10.5%
t 910480
 
10.5%
o 910480
 
10.5%
M 455240
 
5.3%
H 455240
 
5.3%
E 455240
 
5.3%
x 455240
 
5.3%
a 455240
 
5.3%
n 455240
 
5.3%
Other values (5) 2276200
26.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5007640
57.9%
Uppercase Letter 2731440
31.6%
Space Separator 910480
 
10.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 910480
18.2%
o 910480
18.2%
x 455240
9.1%
a 455240
9.1%
n 455240
9.1%
i 455240
9.1%
l 455240
9.1%
g 455240
9.1%
y 455240
9.1%
Uppercase Letter
ValueCountFrequency (%)
N 910480
33.3%
M 455240
16.7%
H 455240
16.7%
E 455240
16.7%
B 455240
16.7%
Space Separator
ValueCountFrequency (%)
(space) 910480
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7739080
89.5%
Common 910480
 
10.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 910480
11.8%
t 910480
11.8%
o 910480
11.8%
M 455240
 
5.9%
H 455240
 
5.9%
E 455240
 
5.9%
x 455240
 
5.9%
a 455240
 
5.9%
n 455240
 
5.9%
B 455240
 
5.9%
Other values (4) 1820960
23.5%
Common
ValueCountFrequency (%)
(space) 910480
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8649560
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 910480
 
10.5%
(space) 910480
 
10.5%
t 910480
 
10.5%
o 910480
 
10.5%
M 455240
 
5.3%
H 455240
 
5.3%
E 455240
 
5.3%
x 455240
 
5.3%
a 455240
 
5.3%
n 455240
 
5.3%
Other values (5) 2276200
26.3%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 18
Median length: 17
Mean length: 17.08130876
Min length: 17

Characters and Unicode

Total characters: 7776095
Distinct characters: 19
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PreservedSpecimen
2nd row: PreservedSpecimen
3rd row: PreservedSpecimen
4th row: PreservedSpecimen
5th row: MachineObservation
ValueCountFrequency (%)
preservedspecimen 418225
91.9%
machineobservation 37015
 
8.1%
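
The mean length and total character count reported for this variable can be cross-checked against the two value counts above. A minimal sketch, assuming the counts refer to the original-case strings shown in the sample rows (the table lowercases them) and that the mean is a count-weighted average of string lengths:

```python
# Value counts from the table above, keyed by the original-case strings.
value_counts = {
    "PreservedSpecimen": 418225,   # 17 characters
    "MachineObservation": 37015,   # 18 characters
}

# Total characters and number of records across both values.
total_chars = sum(len(value) * count for value, count in value_counts.items())
n_records = sum(value_counts.values())

# Count-weighted mean string length.
mean_length = total_chars / n_records

print(total_chars, n_records, round(mean_length, 8))
```

The total matches the 7776095 characters reported for the variable, and the mean agrees with the reported 17.08130876.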

Most occurring characters

ValueCountFrequency (%)
e 2165155
27.8%
r 873465
11.2%
i 492255
 
6.3%
n 492255
 
6.3%
c 455240
 
5.9%
s 455240
 
5.9%
v 455240
 
5.9%
m 418225
 
5.4%
P 418225
 
5.4%
p 418225
 
5.4%
Other values (9) 1132570
14.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6865615
88.3%
Uppercase Letter 910480
 
11.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2165155
31.5%
r 873465
12.7%
i 492255
 
7.2%
n 492255
 
7.2%
c 455240
 
6.6%
s 455240
 
6.6%
v 455240
 
6.6%
m 418225
 
6.1%
p 418225
 
6.1%
d 418225
 
6.1%
Other values (5) 222090
 
3.2%
Uppercase Letter
ValueCountFrequency (%)
P 418225
45.9%
S 418225
45.9%
M 37015
 
4.1%
O 37015
 
4.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 7776095
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2165155
27.8%
r 873465
11.2%
i 492255
 
6.3%
n 492255
 
6.3%
c 455240
 
5.9%
s 455240
 
5.9%
v 455240
 
5.9%
m 418225
 
5.4%
P 418225
 
5.4%
p 418225
 
5.4%
Other values (9) 1132570
14.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7776095
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 2165155
27.8%
r 873465
11.2%
i 492255
 
6.3%
n 492255
 
6.3%
c 455240
 
5.9%
s 455240
 
5.9%
v 455240
 
5.9%
m 418225
 
5.4%
P 418225
 
5.4%
p 418225
 
5.4%
Other values (9) 1132570
14.6%

occurrenceID
Text

Unique 

Distinct: 455240
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 63
Median length: 63
Mean length: 63
Min length: 63

Characters and Unicode

Total characters: 28680120
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 455240
Unique (%): 100.0%

Sample

1st row: http://n2t.net/ark:/65665/30002bab5-5433-4b6c-8496-286a4a697fd7
2nd row: http://n2t.net/ark:/65665/3000315ff-b613-4f47-813c-5c48d8e0a883
3rd row: http://n2t.net/ark:/65665/3ebef4ab3-d946-4961-9221-c7c9692640f8
4th row: http://n2t.net/ark:/65665/3000bbb81-e139-47f8-b2bc-db762804769d
5th row: http://n2t.net/ark:/65665/3002333ca-4702-4d0d-93cd-265885eff56a
ValueCountFrequency (%)
http://n2t.net/ark:/65665/30002bab5-5433-4b6c-8496-286a4a697fd7 1
 
< 0.1%
http://n2t.net/ark:/65665/3009c8a06-9eea-4297-a94a-ca0a7f7c4523 1
 
< 0.1%
http://n2t.net/ark:/65665/3ec0c9d7d-35a5-4841-8a5f-f74ef9113d06 1
 
< 0.1%
http://n2t.net/ark:/65665/300319b93-d6b8-4b79-b23d-d3825483b706 1
 
< 0.1%
http://n2t.net/ark:/65665/3ec168a54-17b9-4a71-9ecc-92d446311c64 1
 
< 0.1%
http://n2t.net/ark:/65665/300370a00-b7af-441b-87cd-9c14a7b5b464 1
 
< 0.1%
http://n2t.net/ark:/65665/3ec2b37a7-5b92-4aa2-ad75-a247e8e353f8 1
 
< 0.1%
http://n2t.net/ark:/65665/3ec3b1e55-b813-49b8-83c7-eadc9323514e 1
 
< 0.1%
http://n2t.net/ark:/65665/3ec3f4a3d-f79a-4cba-b8e4-022b57aa27d6 1
 
< 0.1%
http://n2t.net/ark:/65665/3ec57b74b-7fc8-4006-ad93-945fb0784573 1
 
< 0.1%
Other values (455230) 455230
> 99.9%

Most occurring characters

ValueCountFrequency (%)
/ 2276200
 
7.9%
6 2219258
 
7.7%
- 1820960
 
6.3%
t 1820960
 
6.3%
5 1762766
 
6.1%
a 1423594
 
5.0%
2 1308844
 
4.6%
4 1308826
 
4.6%
e 1308771
 
4.6%
3 1308606
 
4.6%
Other values (16) 12121335
42.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 12402915
43.2%
Lowercase Letter 10814325
37.7%
Other Punctuation 3641920
 
12.7%
Dash Punctuation 1820960
 
6.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 1820960
16.8%
a 1423594
13.2%
e 1308771
12.1%
b 969464
9.0%
n 910480
8.4%
f 853714
7.9%
d 853523
7.9%
c 852859
7.9%
k 455240
 
4.2%
r 455240
 
4.2%
Other values (2) 910480
8.4%
Decimal Number
ValueCountFrequency (%)
6 2219258
17.9%
5 1762766
14.2%
2 1308844
10.6%
4 1308826
10.6%
3 1308606
10.6%
9 967316
7.8%
8 966824
7.8%
0 854278
 
6.9%
1 853655
 
6.9%
7 852542
 
6.9%
Other Punctuation
ValueCountFrequency (%)
/ 2276200
62.5%
: 910480
 
25.0%
. 455240
 
12.5%
Dash Punctuation
ValueCountFrequency (%)
- 1820960
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 17865795
62.3%
Latin 10814325
37.7%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 2276200
12.7%
6 2219258
12.4%
- 1820960
10.2%
5 1762766
9.9%
2 1308844
7.3%
4 1308826
7.3%
3 1308606
7.3%
9 967316
 
5.4%
8 966824
 
5.4%
: 910480
 
5.1%
Other values (4) 3015715
16.9%
Latin
ValueCountFrequency (%)
t 1820960
16.8%
a 1423594
13.2%
e 1308771
12.1%
b 969464
9.0%
n 910480
8.4%
f 853714
7.9%
d 853523
7.9%
c 852859
7.9%
k 455240
 
4.2%
r 455240
 
4.2%
Other values (2) 910480
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 28680120
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 2276200
 
7.9%
6 2219258
 
7.7%
- 1820960
 
6.3%
t 1820960
 
6.3%
5 1762766
 
6.1%
a 1423594
 
5.0%
2 1308844
 
4.6%
4 1308826
 
4.6%
e 1308771
 
4.6%
3 1308606
 
4.6%
Other values (16) 12121335
42.3%
Distinct: 455232
Distinct (%): > 99.9%
Missing: 3
Missing (%): < 0.1%
Memory size: 3.5 MiB

Length

Max length: 14
Median length: 11
Mean length: 11.04622647
Min length: 6

Characters and Unicode

Total characters: 5028651
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 455227
Unique (%): > 99.9%

Sample

1st row: USNM 51082
2nd row: USNM 110432
3rd row: USNM 49860
4th row: USNM 239751
5th row: USNM RAD122557
ValueCountFrequency (%)
usnm 455237
50.0%
465983 2
 
< 0.1%
114351 2
 
< 0.1%
466814 2
 
< 0.1%
rad125895 2
 
< 0.1%
135878 2
 
< 0.1%
fin31989 1
 
< 0.1%
85367 1
 
< 0.1%
253658 1
 
< 0.1%
457486 1
 
< 0.1%
Other values (455223) 455223
50.0%

Most occurring characters

ValueCountFrequency (%)
N 464952
 
9.2%
U 455237
 
9.1%
S 455237
 
9.1%
M 455237
 
9.1%
(space) 455237
 
9.1%
1 348747
 
6.9%
2 335224
 
6.7%
3 330443
 
6.6%
4 286535
 
5.7%
6 228767
 
4.5%
Other values (10) 1213035
24.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2644655
52.6%
Uppercase Letter 1928759
38.4%
Space Separator 455237
 
9.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 348747
13.2%
2 335224
12.7%
3 330443
12.5%
4 286535
10.8%
6 228767
8.7%
0 227714
8.6%
7 226608
8.6%
5 223346
8.4%
9 219365
8.3%
8 217906
8.2%
Uppercase Letter
ValueCountFrequency (%)
N 464952
24.1%
U 455237
23.6%
S 455237
23.6%
M 455237
23.6%
R 26222
 
1.4%
A 26222
 
1.4%
D 26222
 
1.4%
F 9715
 
0.5%
I 9715
 
0.5%
Space Separator
ValueCountFrequency (%)
(space) 455237
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3099892
61.6%
Latin 1928759
38.4%

Most frequent character per script

Common
ValueCountFrequency (%)
(space) 455237
14.7%
1 348747
11.3%
2 335224
10.8%
3 330443
10.7%
4 286535
9.2%
6 228767
7.4%
0 227714
7.3%
7 226608
7.3%
5 223346
7.2%
9 219365
7.1%
Latin
ValueCountFrequency (%)
N 464952
24.1%
U 455237
23.6%
S 455237
23.6%
M 455237
23.6%
R 26222
 
1.4%
A 26222
 
1.4%
D 26222
 
1.4%
F 9715
 
0.5%
I 9715
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5028651
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 464952
 
9.2%
U 455237
 
9.1%
S 455237
 
9.1%
M 455237
 
9.1%
(space) 455237
 
9.1%
1 348747
 
6.9%
2 335224
 
6.7%
3 330443
 
6.6%
4 286535
 
5.7%
6 228767
 
4.5%
Other values (10) 1213035
24.1%

recordNumber
Text

Missing 

Distinct: 20815
Distinct (%): 99.9%
Missing: 434413
Missing (%): 95.4%
Memory size: 3.5 MiB

Length

Max length: 42
Median length: 8
Mean length: 8.388438085
Min length: 1

Characters and Unicode

Total characters: 174706
Distinct characters: 64
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20804
Unique (%): 99.9%

Sample

1st row: PHISH-032
2nd row: AUST-251
3rd row: MOC11646
4th row: RP-202
5th row: SCIL-052
ValueCountFrequency (%)
blz 1430
 
5.5%
bah 710
 
2.8%
tci 682
 
2.6%
sms 536
 
2.1%
cur 426
 
1.7%
tob 393
 
1.5%
twn 280
 
1.1%
hbb 157
 
0.6%
fcc 146
 
0.6%
keb&mgg 111
 
0.4%
Other values (18989) 20922
81.1%

Most occurring characters

ValueCountFrequency (%)
1 16672
 
9.5%
0 13196
 
7.6%
- 11059
 
6.3%
2 9655
 
5.5%
3 7495
 
4.3%
7 6732
 
3.9%
4 6595
 
3.8%
9 6305
 
3.6%
S 6075
 
3.5%
I 5692
 
3.3%
Other values (54) 85230
48.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 83150
47.6%
Uppercase Letter 68576
39.3%
Dash Punctuation 11059
 
6.3%
Lowercase Letter 6363
 
3.6%
Space Separator 4966
 
2.8%
Connector Punctuation 199
 
0.1%
Other Punctuation 135
 
0.1%
Open Punctuation 129
 
0.1%
Close Punctuation 129
 
0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 6075
 
8.9%
I 5692
 
8.3%
H 4851
 
7.1%
C 4742
 
6.9%
R 4572
 
6.7%
B 4223
 
6.2%
P 3828
 
5.6%
L 3820
 
5.6%
M 3805
 
5.5%
U 3666
 
5.3%
Other values (16) 23302
34.0%
Lowercase Letter
ValueCountFrequency (%)
i 1435
22.6%
m 1341
21.1%
b 1290
20.3%
o 1289
20.3%
n 256
 
4.0%
a 147
 
2.3%
y 140
 
2.2%
u 138
 
2.2%
q 136
 
2.1%
t 133
 
2.1%
Other values (7) 58
 
0.9%
Decimal Number
ValueCountFrequency (%)
1 16672
20.1%
0 13196
15.9%
2 9655
11.6%
3 7495
9.0%
7 6732
8.1%
4 6595
 
7.9%
9 6305
 
7.6%
8 5653
 
6.8%
5 5523
 
6.6%
6 5324
 
6.4%
Other Punctuation
ValueCountFrequency (%)
& 111
82.2%
. 9
 
6.7%
; 9
 
6.7%
* 4
 
3.0%
: 1
 
0.7%
? 1
 
0.7%
Dash Punctuation
ValueCountFrequency (%)
- 11059
100.0%
Space Separator
ValueCountFrequency (%)
4966
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 199
100.0%
Open Punctuation
ValueCountFrequency (%)
( 129
100.0%
Close Punctuation
ValueCountFrequency (%)
) 129
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 99767
57.1%
Latin 74939
42.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 6075
 
8.1%
I 5692
 
7.6%
H 4851
 
6.5%
C 4742
 
6.3%
R 4572
 
6.1%
B 4223
 
5.6%
P 3828
 
5.1%
L 3820
 
5.1%
M 3805
 
5.1%
U 3666
 
4.9%
Other values (33) 29665
39.6%
Common
ValueCountFrequency (%)
1 16672
16.7%
0 13196
13.2%
- 11059
11.1%
2 9655
9.7%
3 7495
7.5%
7 6732
6.7%
4 6595
 
6.6%
9 6305
 
6.3%
8 5653
 
5.7%
5 5523
 
5.5%
Other values (11) 10882
10.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 174706
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 16672
 
9.5%
0 13196
 
7.6%
- 11059
 
6.3%
2 9655
 
5.5%
3 7495
 
4.3%
7 6732
 
3.9%
4 6595
 
3.8%
9 6305
 
3.6%
S 6075
 
3.5%
I 5692
 
3.3%
Other values (54) 85230
48.8%

recordedBy
Text

Missing 

Distinct 7883
Distinct (%) 4.7%
Missing 287329
Missing (%) 63.1%
Memory size 3.5 MiB

Length

Max length 240
Median length 115
Mean length 26.11755632
Min length 1

Characters and Unicode

Total characters 4385425
Distinct characters 76
Distinct categories 8
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3022
Unique (%) 1.8%

Sample

1st row J. Snyder
2nd row D. Richardson
3rd row Smithsonian Team, A. Alcala & Silliman University Group
4th row Bronson
5th row G. Hendler
ValueCountFrequency (%)
77672
 
9.1%
j 42742
 
5.0%
m 36874
 
4.3%
d 28294
 
3.3%
r 27607
 
3.2%
c 22146
 
2.6%
l 20147
 
2.3%
h 19638
 
2.3%
s 18374
 
2.1%
a 17771
 
2.1%
Other values (4981) 546448
63.7%

Most occurring characters

ValueCountFrequency (%)
689802
15.7%
. 355009
 
8.1%
e 287112
 
6.5%
a 267906
 
6.1%
r 202156
 
4.6%
n 201841
 
4.6%
i 198862
 
4.5%
o 170917
 
3.9%
l 161888
 
3.7%
t 156536
 
3.6%
Other values (66) 1693396
38.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2340244
53.4%
Uppercase Letter 780660
 
17.8%
Space Separator 689802
 
15.7%
Other Punctuation 562453
 
12.8%
Dash Punctuation 6626
 
0.2%
Open Punctuation 2739
 
0.1%
Close Punctuation 2739
 
0.1%
Decimal Number 162
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 287112
12.3%
a 267906
11.4%
r 202156
8.6%
n 201841
8.6%
i 198862
8.5%
o 170917
 
7.3%
l 161888
 
6.9%
t 156536
 
6.7%
s 132880
 
5.7%
h 80008
 
3.4%
Other values (19) 480138
20.5%
Uppercase Letter
ValueCountFrequency (%)
M 78823
 
10.1%
S 64729
 
8.3%
C 58546
 
7.5%
B 55967
 
7.2%
J 53700
 
6.9%
R 51839
 
6.6%
H 41817
 
5.4%
P 41068
 
5.3%
D 40848
 
5.2%
W 39861
 
5.1%
Other values (16) 253462
32.5%
Decimal Number
ValueCountFrequency (%)
9 39
24.1%
0 26
16.0%
1 21
13.0%
8 21
13.0%
3 17
10.5%
2 16
9.9%
7 11
 
6.8%
6 5
 
3.1%
4 5
 
3.1%
5 1
 
0.6%
Other Punctuation
ValueCountFrequency (%)
. 355009
63.1%
, 129604
 
23.0%
& 77583
 
13.8%
' 240
 
< 0.1%
# 10
 
< 0.1%
? 5
 
< 0.1%
/ 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
689802
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6626
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2739
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2739
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3120904
71.2%
Common 1264521
28.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 287112
 
9.2%
a 267906
 
8.6%
r 202156
 
6.5%
n 201841
 
6.5%
i 198862
 
6.4%
o 170917
 
5.5%
l 161888
 
5.2%
t 156536
 
5.0%
s 132880
 
4.3%
h 80008
 
2.6%
Other values (45) 1260798
40.4%
Common
ValueCountFrequency (%)
689802
54.6%
. 355009
28.1%
, 129604
 
10.2%
& 77583
 
6.1%
- 6626
 
0.5%
( 2739
 
0.2%
) 2739
 
0.2%
' 240
 
< 0.1%
9 39
 
< 0.1%
0 26
 
< 0.1%
Other values (11) 114
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4385388
> 99.9%
None 37
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
689802
15.7%
. 355009
 
8.1%
e 287112
 
6.5%
a 267906
 
6.1%
r 202156
 
4.6%
n 201841
 
4.6%
i 198862
 
4.5%
o 170917
 
3.9%
l 161888
 
3.7%
t 156536
 
3.6%
Other values (63) 1693359
38.6%
None
ValueCountFrequency (%)
ü 32
86.5%
ô 4
 
10.8%
í 1
 
2.7%
Distinct 619
Distinct (%) 0.1%
Missing 15
Missing (%) < 0.1%
Memory size 3.5 MiB

Length

Max length 4
Median length 1
Mean length 1.121069801
Min length 1

Characters and Unicode

Total characters 510339
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 251
Unique (%) 0.1%

Sample

1st row 1
2nd row 1
3rd row 9
4th row 12
5th row 1
ValueCountFrequency (%)
1 245725
54.0%
2 61720
 
13.6%
3 30726
 
6.7%
4 19437
 
4.3%
5 14095
 
3.1%
6 10261
 
2.3%
7 7454
 
1.6%
10 6775
 
1.5%
8 6056
 
1.3%
9 4855
 
1.1%
Other values (609) 48121
 
10.6%

Most occurring characters

ValueCountFrequency (%)
1 279529
54.8%
2 78474
 
15.4%
3 39933
 
7.8%
4 26279
 
5.1%
5 23286
 
4.6%
0 18677
 
3.7%
6 14875
 
2.9%
7 11614
 
2.3%
8 9548
 
1.9%
9 8124
 
1.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 510339
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 279529
54.8%
2 78474
 
15.4%
3 39933
 
7.8%
4 26279
 
5.1%
5 23286
 
4.6%
0 18677
 
3.7%
6 14875
 
2.9%
7 11614
 
2.3%
8 9548
 
1.9%
9 8124
 
1.6%

Most occurring scripts

ValueCountFrequency (%)
Common 510339
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 279529
54.8%
2 78474
 
15.4%
3 39933
 
7.8%
4 26279
 
5.1%
5 23286
 
4.6%
0 18677
 
3.7%
6 14875
 
2.9%
7 11614
 
2.3%
8 9548
 
1.9%
9 8124
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 510339
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 279529
54.8%
2 78474
 
15.4%
3 39933
 
7.8%
4 26279
 
5.1%
5 23286
 
4.6%
0 18677
 
3.7%
6 14875
 
2.9%
7 11614
 
2.3%
8 9548
 
1.9%
9 8124
 
1.6%

sex
Text

Constant  Missing 

Distinct 1
Distinct (%) 33.3%
Missing 455237
Missing (%) > 99.9%
Memory size 3.5 MiB

Length

Max length 4
Median length 4
Mean length 4
Min length 4

Characters and Unicode

Total characters 12
Distinct characters 4
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row male
2nd row male
3rd row male
ValueCountFrequency (%)
male 3
100.0%

Most occurring characters

ValueCountFrequency (%)
m 3
25.0%
a 3
25.0%
l 3
25.0%
e 3
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
m 3
25.0%
a 3
25.0%
l 3
25.0%
e 3
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
m 3
25.0%
a 3
25.0%
l 3
25.0%
e 3
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
m 3
25.0%
a 3
25.0%
l 3
25.0%
e 3
25.0%

preparations
Text

Missing 

Distinct 325
Distinct (%) 0.3%
Missing 346202
Missing (%) 76.0%
Memory size 3.5 MiB

Length

Max length 255
Median length 192
Mean length 11.80385737
Min length 4

Characters and Unicode

Total characters 1287069
Distinct characters 56
Distinct categories 8
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 141
Unique (%) 0.1%

Sample

1st row Dry Osteological Specimen
2nd row Glycerin with Bone Stain
3rd row Polyester
4th row Larvae [ETOH Fixed]
5th row Unknown
ValueCountFrequency (%)
larvae 25642
14.6%
polyester 20068
 
11.4%
photograph 14071
 
8.0%
unknown 11507
 
6.6%
film 9617
 
5.5%
specimen 8058
 
4.6%
osteological 7027
 
4.0%
glycerin 7020
 
4.0%
with 7018
 
4.0%
stain 7013
 
4.0%
Other values (60) 58281
33.2%

Most occurring characters

ValueCountFrequency (%)
e 130557
 
10.1%
a 117251
 
9.1%
o 94966
 
7.4%
r 91536
 
7.1%
t 83192
 
6.5%
n 69634
 
5.4%
l 66769
 
5.2%
i 66644
 
5.2%
66284
 
5.1%
h 41881
 
3.3%
Other values (46) 458355
35.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1023360
79.5%
Uppercase Letter 182972
 
14.2%
Space Separator 66284
 
5.1%
Other Punctuation 6531
 
0.5%
Open Punctuation 3928
 
0.3%
Close Punctuation 3928
 
0.3%
Dash Punctuation 60
 
< 0.1%
Decimal Number 6
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 130557
12.8%
a 117251
11.5%
o 94966
9.3%
r 91536
8.9%
t 83192
 
8.1%
n 69634
 
6.8%
l 66769
 
6.5%
i 66644
 
6.5%
h 41881
 
4.1%
y 35280
 
3.4%
Other values (13) 225650
22.0%
Uppercase Letter
ValueCountFrequency (%)
P 34355
18.8%
L 26511
14.5%
S 15787
8.6%
F 15062
8.2%
O 12531
 
6.8%
U 11507
 
6.3%
D 9090
 
5.0%
E 7135
 
3.9%
G 7020
 
3.8%
A 6982
 
3.8%
Other values (11) 36992
20.2%
Other Punctuation
ValueCountFrequency (%)
; 6173
94.5%
. 186
 
2.8%
& 89
 
1.4%
, 78
 
1.2%
% 5
 
0.1%
Decimal Number
ValueCountFrequency (%)
3 4
66.7%
9 1
 
16.7%
5 1
 
16.7%
Space Separator
ValueCountFrequency (%)
66284
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 3928
100.0%
Close Punctuation
ValueCountFrequency (%)
] 3928
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 60
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1206332
93.7%
Common 80737
 
6.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 130557
 
10.8%
a 117251
 
9.7%
o 94966
 
7.9%
r 91536
 
7.6%
t 83192
 
6.9%
n 69634
 
5.8%
l 66769
 
5.5%
i 66644
 
5.5%
h 41881
 
3.5%
y 35280
 
2.9%
Other values (34) 408622
33.9%
Common
ValueCountFrequency (%)
66284
82.1%
; 6173
 
7.6%
[ 3928
 
4.9%
] 3928
 
4.9%
. 186
 
0.2%
& 89
 
0.1%
, 78
 
0.1%
- 60
 
0.1%
% 5
 
< 0.1%
3 4
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1287069
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 130557
 
10.1%
a 117251
 
9.1%
o 94966
 
7.4%
r 91536
 
7.1%
t 83192
 
6.5%
n 69634
 
5.4%
l 66769
 
5.2%
i 66644
 
5.2%
66284
 
5.1%
h 41881
 
3.3%
Other values (46) 458355
35.6%

associatedMedia
Text

Missing 

Distinct 59711
Distinct (%) 63.5%
Missing 361191
Missing (%) 79.3%
Memory size 3.5 MiB

Length

Max length 579
Median length 49
Mean length 55.87516082
Min length 48

Characters and Unicode

Total characters 5255003
Distinct characters 31
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 52407
Unique (%) 55.7%

Sample

1st row https://collections.nmnh.si.edu/media/?i=10977965
2nd row https://collections.nmnh.si.edu/media/?i=13093375
3rd row https://collections.nmnh.si.edu/media/?i=5000406; 14894714
4th row https://collections.nmnh.si.edu/media/?i=13314074
5th row https://collections.nmnh.si.edu/media/?i=12868276
ValueCountFrequency (%)
14558510 3137
 
1.9%
14894714 3118
 
1.9%
14888503 2880
 
1.8%
14888504 2083
 
1.3%
5000375 2034
 
1.3%
5000376 2034
 
1.3%
15777181 1441
 
0.9%
15596573 1414
 
0.9%
5000606 1251
 
0.8%
https://collections.nmnh.si.edu/media/?i=5004168 1238
 
0.8%
Other values (84688) 142051
87.3%

Most occurring characters

ValueCountFrequency (%)
/ 376196
 
7.2%
i 376196
 
7.2%
e 282147
 
5.4%
n 282147
 
5.4%
s 282147
 
5.4%
t 282147
 
5.4%
. 282147
 
5.4%
1 238589
 
4.5%
0 221459
 
4.2%
d 188098
 
3.6%
Other values (21) 2443730
46.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2915519
55.5%
Decimal Number 1261730
24.0%
Other Punctuation 915073
 
17.4%
Math Symbol 94049
 
1.8%
Space Separator 68632
 
1.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 376196
12.9%
e 282147
9.7%
n 282147
9.7%
s 282147
9.7%
t 282147
9.7%
d 188098
 
6.5%
m 188098
 
6.5%
h 188098
 
6.5%
l 188098
 
6.5%
o 188098
 
6.5%
Other values (4) 470245
16.1%
Decimal Number
ValueCountFrequency (%)
1 238589
18.9%
0 221459
17.6%
5 157456
12.5%
4 112243
8.9%
8 104762
8.3%
3 92948
 
7.4%
9 89804
 
7.1%
7 85275
 
6.8%
6 84213
 
6.7%
2 74981
 
5.9%
Other Punctuation
ValueCountFrequency (%)
/ 376196
41.1%
. 282147
30.8%
? 94049
 
10.3%
: 94049
 
10.3%
; 68632
 
7.5%
Math Symbol
ValueCountFrequency (%)
= 94049
100.0%
Space Separator
ValueCountFrequency (%)
68632
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2915519
55.5%
Common 2339484
44.5%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 376196
16.1%
. 282147
12.1%
1 238589
10.2%
0 221459
 
9.5%
5 157456
 
6.7%
4 112243
 
4.8%
8 104762
 
4.5%
= 94049
 
4.0%
? 94049
 
4.0%
: 94049
 
4.0%
Other values (7) 564485
24.1%
Latin
ValueCountFrequency (%)
i 376196
12.9%
e 282147
9.7%
n 282147
9.7%
s 282147
9.7%
t 282147
9.7%
d 188098
 
6.5%
m 188098
 
6.5%
h 188098
 
6.5%
l 188098
 
6.5%
o 188098
 
6.5%
Other values (4) 470245
16.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5255003
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 376196
 
7.2%
i 376196
 
7.2%
e 282147
 
5.4%
n 282147
 
5.4%
s 282147
 
5.4%
t 282147
 
5.4%
. 282147
 
5.4%
1 238589
 
4.5%
0 221459
 
4.2%
d 188098
 
3.6%
Other values (21) 2443730
46.5%

associatedSequences
Text

Missing 

Distinct 447
Distinct (%) 99.3%
Missing 454790
Missing (%) 99.9%
Memory size 3.5 MiB

Length

Max length 249
Median length 49
Mean length 59.88888889
Min length 49

Characters and Unicode

Total characters 26950
Distinct characters 51
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 444
Unique (%) 98.7%

Sample

1st row https://www.ncbi.nlm.nih.gov/gquery?term=FJ609901
2nd row https://www.ncbi.nlm.nih.gov/gquery?term=HQ600890
3rd row https://www.ncbi.nlm.nih.gov/gquery?term=HM748411
4th row https://www.ncbi.nlm.nih.gov/gquery?term=HQ600884
5th row https://www.ncbi.nlm.nih.gov/gquery?term=MN621852
ValueCountFrequency (%)
https://www.ncbi.nlm.nih.gov/gquery?term=mn549761 2
 
0.4%
https://www.ncbi.nlm.nih.gov/gquery?term=hq543050 2
 
0.4%
https://www.ncbi.nlm.nih.gov/gquery?term=hq543049 2
 
0.4%
https://www.ncbi.nlm.nih.gov/gquery?term=gq367323 1
 
0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=hq543043 1
 
0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=hq600884 1
 
0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=mn621852 1
 
0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=hq325698|https://www.ncbi.nlm.nih.gov/gquery?term=hq325631 1
 
0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=hm748370 1
 
0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=ef536294|https://www.ncbi.nlm.nih.gov/gquery?term=ef536256|https://www.ncbi.nlm.nih.gov/gquery?term=ef539241|https://www.ncbi.nlm.nih.gov/gquery?term=ef533917|https://www.ncbi.nlm.nih.gov/gquery?term=ef530094 1
 
0.2%
Other values (437) 437
97.1%

Most occurring characters

ValueCountFrequency (%)
. 2192
 
8.1%
t 1644
 
6.1%
/ 1644
 
6.1%
w 1644
 
6.1%
n 1644
 
6.1%
h 1096
 
4.1%
r 1096
 
4.1%
e 1096
 
4.1%
i 1096
 
4.1%
m 1096
 
4.1%
Other values (41) 12702
47.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16988
63.0%
Other Punctuation 4932
 
18.3%
Decimal Number 3288
 
12.2%
Uppercase Letter 1096
 
4.1%
Math Symbol 646
 
2.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 1644
 
9.7%
w 1644
 
9.7%
n 1644
 
9.7%
h 1096
 
6.5%
r 1096
 
6.5%
e 1096
 
6.5%
i 1096
 
6.5%
m 1096
 
6.5%
g 1096
 
6.5%
v 548
 
3.2%
Other values (9) 4932
29.0%
Uppercase Letter
ValueCountFrequency (%)
Q 219
20.0%
H 165
15.1%
F 141
12.9%
M 122
11.1%
G 99
9.0%
J 85
 
7.8%
A 83
 
7.6%
N 75
 
6.8%
Y 30
 
2.7%
E 29
 
2.6%
Other values (6) 48
 
4.4%
Decimal Number
ValueCountFrequency (%)
6 439
13.4%
0 417
12.7%
3 406
12.3%
4 396
12.0%
7 382
11.6%
9 352
10.7%
5 289
8.8%
8 261
7.9%
2 201
6.1%
1 145
 
4.4%
Other Punctuation
ValueCountFrequency (%)
. 2192
44.4%
/ 1644
33.3%
? 548
 
11.1%
: 548
 
11.1%
Math Symbol
ValueCountFrequency (%)
= 548
84.8%
| 98
 
15.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 18084
67.1%
Common 8866
32.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 1644
 
9.1%
w 1644
 
9.1%
n 1644
 
9.1%
h 1096
 
6.1%
r 1096
 
6.1%
e 1096
 
6.1%
i 1096
 
6.1%
m 1096
 
6.1%
g 1096
 
6.1%
v 548
 
3.0%
Other values (25) 6028
33.3%
Common
ValueCountFrequency (%)
. 2192
24.7%
/ 1644
18.5%
= 548
 
6.2%
? 548
 
6.2%
: 548
 
6.2%
6 439
 
5.0%
0 417
 
4.7%
3 406
 
4.6%
4 396
 
4.5%
7 382
 
4.3%
Other values (6) 1346
15.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 26950
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 2192
 
8.1%
t 1644
 
6.1%
/ 1644
 
6.1%
w 1644
 
6.1%
n 1644
 
6.1%
h 1096
 
4.1%
r 1096
 
4.1%
e 1096
 
4.1%
i 1096
 
4.1%
m 1096
 
4.1%
Other values (41) 12702
47.1%

occurrenceRemarks
Text

Missing 

Distinct 80322
Distinct (%) 48.8%
Missing 290501
Missing (%) 63.8%
Memory size 3.5 MiB

Length

Max length 146530
Median length 18368
Mean length 70.30704933
Min length 1

Characters and Unicode

Total characters 11582313
Distinct characters 115
Distinct categories 16
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 70653
Unique (%) 42.9%

Sample

1st row Note in ledger: " pair of otoliths"; Otoliths are stored in the Osteo Collection.; Stored in Osteo Collection.; The ototliths are stored in Mugil Box 1 of 1, which contains catalog numbers: 110428, 110429, 110430, 110431, 110432, 110433, 110434, 110435, 110436, 110438, 110439, 110440, and 110441.
2nd row Cat. no. 105
3rd row Host-bohadschia argus. rec from: truett, d. f.
4th row Specimen measurements as written on the slide mount: SL (mm)= 205; TL (mm)= 10" (254); This material is part of the John and Helen Randall Slide Collection. The slides were digitized October 2017. The Randall donation includes all intellectual property rights.; Black paint/goop on the film. Not obscuring specimen.
5th row Specimen measurements as written on the slide mount: SL (mm)= 57; TL (mm)= 2.8" (71); This material is part of the John and Helen Randall Slide Collection. The slides were digitized October 2017. The Randall donation includes all intellectual property rights.
ValueCountFrequency (%)
the 73562
 
4.0%
of 48916
 
2.7%
in 34957
 
1.9%
and 29491
 
1.6%
mm 26664
 
1.5%
collection 24234
 
1.3%
specimen 23150
 
1.3%
as 22911
 
1.3%
is 22746
 
1.2%
1 20761
 
1.1%
Other values (77718) 1496634
82.1%

Most occurring characters

ValueCountFrequency (%)
1591640
 
13.7%
e 923937
 
8.0%
i 617295
 
5.3%
a 604169
 
5.2%
t 600600
 
5.2%
n 577716
 
5.0%
o 572900
 
4.9%
r 444589
 
3.8%
s 444532
 
3.8%
l 442349
 
3.8%
Other values (105) 4762586
41.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7440215
64.2%
Space Separator 1591640
 
13.7%
Decimal Number 959199
 
8.3%
Uppercase Letter 617506
 
5.3%
Other Punctuation 481608
 
4.2%
Control 352108
 
3.0%
Dash Punctuation 51162
 
0.4%
Open Punctuation 30356
 
0.3%
Close Punctuation 30344
 
0.3%
Math Symbol 27575
 
0.2%
Other values (6) 600
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 923937
12.4%
i 617295
 
8.3%
a 604169
 
8.1%
t 600600
 
8.1%
n 577716
 
7.8%
o 572900
 
7.7%
r 444589
 
6.0%
s 444532
 
6.0%
l 442349
 
5.9%
d 333164
 
4.5%
Other values (24) 1878964
25.3%
Uppercase Letter
ValueCountFrequency (%)
S 94017
15.2%
T 65766
10.7%
L 45758
 
7.4%
N 44811
 
7.3%
O 43368
 
7.0%
C 38553
 
6.2%
R 37854
 
6.1%
M 33592
 
5.4%
A 31872
 
5.2%
P 24341
 
3.9%
Other values (21) 157574
25.5%
Other Punctuation
ValueCountFrequency (%)
. 257062
53.4%
, 74594
 
15.5%
: 49847
 
10.4%
; 42979
 
8.9%
/ 20610
 
4.3%
" 18849
 
3.9%
' 6994
 
1.5%
# 6654
 
1.4%
& 2161
 
0.4%
? 1162
 
0.2%
Other values (6) 696
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 173082
18.0%
0 131583
13.7%
2 126840
13.2%
3 94426
9.8%
9 86779
9.0%
5 76238
7.9%
4 71583
7.5%
7 68822
 
7.2%
6 66516
 
6.9%
8 63330
 
6.6%
Math Symbol
ValueCountFrequency (%)
= 25891
93.9%
+ 1644
 
6.0%
~ 31
 
0.1%
< 4
 
< 0.1%
> 4
 
< 0.1%
| 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 30125
99.2%
[ 213
 
0.7%
{ 18
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 30115
99.2%
] 213
 
0.7%
} 16
 
0.1%
Control
ValueCountFrequency (%)
350257
99.5%
1851
 
0.5%
Dash Punctuation
ValueCountFrequency (%)
- 50620
98.9%
542
 
1.1%
Final Punctuation
ValueCountFrequency (%)
35
67.3%
17
32.7%
Space Separator
ValueCountFrequency (%)
1591640
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 512
100.0%
Initial Punctuation
ValueCountFrequency (%)
23
100.0%
Other Symbol
ValueCountFrequency (%)
° 8
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 3
100.0%
Modifier Symbol
ValueCountFrequency (%)
^ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8057721
69.6%
Common 3524592
30.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 923937
 
11.5%
i 617295
 
7.7%
a 604169
 
7.5%
t 600600
 
7.5%
n 577716
 
7.2%
o 572900
 
7.1%
r 444589
 
5.5%
s 444532
 
5.5%
l 442349
 
5.5%
d 333164
 
4.1%
Other values (55) 2496470
31.0%
Common
ValueCountFrequency (%)
1591640
45.2%
350257
 
9.9%
. 257062
 
7.3%
1 173082
 
4.9%
0 131583
 
3.7%
2 126840
 
3.6%
3 94426
 
2.7%
9 86779
 
2.5%
5 76238
 
2.2%
, 74594
 
2.1%
Other values (40) 562091
 
15.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11581351
> 99.9%
Punctuation 617
 
< 0.1%
None 345
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1591640
 
13.7%
e 923937
 
8.0%
i 617295
 
5.3%
a 604169
 
5.2%
t 600600
 
5.2%
n 577716
 
5.0%
o 572900
 
4.9%
r 444589
 
3.8%
s 444532
 
3.8%
l 442349
 
3.8%
Other values (86) 4761624
41.1%
Punctuation
ValueCountFrequency (%)
542
87.8%
35
 
5.7%
23
 
3.7%
17
 
2.8%
None
ValueCountFrequency (%)
ã 158
45.8%
é 52
 
15.1%
á 34
 
9.9%
è 23
 
6.7%
ó 22
 
6.4%
ê 21
 
6.1%
í 12
 
3.5%
° 8
 
2.3%
Ë 5
 
1.4%
· 2
 
0.6%
Other values (5) 8
 
2.3%

organismID
Text

Constant  Missing 

Distinct 1
Distinct (%) 50.0%
Missing 455238
Missing (%) > 99.9%
Memory size 3.5 MiB

Length

Max length 16
Median length 16
Mean length 16
Min length 16

Characters and Unicode

Total characters 32
Distinct characters 11
Distinct categories 5
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 68° 53' 56.61" W
2nd row 68° 53' 56.61" W
ValueCountFrequency (%)
68° 2
25.0%
53 2
25.0%
56.61 2
25.0%
w 2
25.0%

Most occurring characters

ValueCountFrequency (%)
6 6
18.8%
6
18.8%
5 4
12.5%
8 2
 
6.2%
° 2
 
6.2%
3 2
 
6.2%
' 2
 
6.2%
. 2
 
6.2%
1 2
 
6.2%
" 2
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16
50.0%
Space Separator 6
 
18.8%
Other Punctuation 6
 
18.8%
Other Symbol 2
 
6.2%
Uppercase Letter 2
 
6.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 6
37.5%
5 4
25.0%
8 2
 
12.5%
3 2
 
12.5%
1 2
 
12.5%
Other Punctuation
ValueCountFrequency (%)
' 2
33.3%
. 2
33.3%
" 2
33.3%
Space Separator
ValueCountFrequency (%)
6
100.0%
Other Symbol
ValueCountFrequency (%)
° 2
100.0%
Uppercase Letter
ValueCountFrequency (%)
W 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 30
93.8%
Latin 2
 
6.2%

Most frequent character per script

Common
ValueCountFrequency (%)
6 6
20.0%
6
20.0%
5 4
13.3%
8 2
 
6.7%
° 2
 
6.7%
3 2
 
6.7%
' 2
 
6.7%
. 2
 
6.7%
1 2
 
6.7%
" 2
 
6.7%
Latin
ValueCountFrequency (%)
W 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30
93.8%
None 2
 
6.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 6
20.0%
6
20.0%
5 4
13.3%
8 2
 
6.7%
3 2
 
6.7%
' 2
 
6.7%
. 2
 
6.7%
1 2
 
6.7%
" 2
 
6.7%
W 2
 
6.7%
None
ValueCountFrequency (%)
° 2
100.0%

organismRemarks
Text

Missing 

Distinct 2
Distinct (%) 66.7%
Missing 455237
Missing (%) > 99.9%
Memory size 3.5 MiB

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Characters and Unicode

Total characters 9
Distinct characters 3
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 33.3%

Sample

1st row 0.0
2nd row 8.0
3rd row 0.0
ValueCountFrequency (%)
0.0 2
66.7%
8.0 1
33.3%

Most occurring characters

ValueCountFrequency (%)
0 5
55.6%
. 3
33.3%
8 1
 
11.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
66.7%
Other Punctuation 3
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 5
83.3%
8 1
 
16.7%
Other Punctuation
ValueCountFrequency (%)
. 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 9
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 5
55.6%
. 3
33.3%
8 1
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 5
55.6%
. 3
33.3%
8 1
 
11.1%

materialEntityID
Text

Missing 

Distinct 3
Distinct (%) 100.0%
Missing 455237
Missing (%) > 99.9%
Memory size 3.5 MiB

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Characters and Unicode

Total characters 9
Distinct characters 5
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3
Unique (%) 100.0%

Sample

1st row 1.0
2nd row 9.0
3rd row 4.0
ValueCountFrequency (%)
1.0 1
33.3%
9.0 1
33.3%
4.0 1
33.3%

Most occurring characters

ValueCountFrequency (%)
. 3
33.3%
0 3
33.3%
1 1
 
11.1%
9 1
 
11.1%
4 1
 
11.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
66.7%
Other Punctuation 3
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3
50.0%
1 1
 
16.7%
9 1
 
16.7%
4 1
 
16.7%
Other Punctuation
ValueCountFrequency (%)
. 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 9
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 3
33.3%
0 3
33.3%
1 1
 
11.1%
9 1
 
11.1%
4 1
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 3
33.3%
0 3
33.3%
1 1
 
11.1%
9 1
 
11.1%
4 1
 
11.1%

eventType
Text

Missing 

Distinct 3
Distinct (%) 100.0%
Missing 455237
Missing (%) > 99.9%
Memory size 3.5 MiB

Length

Max length 7
Median length 7
Mean length 6.666666667
Min length 6

Characters and Unicode

Total characters 20
Distinct characters 11
Distinct categories 3
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3
Unique (%) 100.0%

Sample

1st row -9.3883
2nd row 10.6925
3rd row 7.0083
ValueCountFrequency (%)
9.3883 1
33.3%
10.6925 1
33.3%
7.0083 1
33.3%

Most occurring characters

ValueCountFrequency (%)
. 3
15.0%
3 3
15.0%
8 3
15.0%
0 3
15.0%
9 2
10.0%
- 1
 
5.0%
1 1
 
5.0%
6 1
 
5.0%
2 1
 
5.0%
5 1
 
5.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16
80.0%
Other Punctuation 3
 
15.0%
Dash Punctuation 1
 
5.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 3
18.8%
8 3
18.8%
0 3
18.8%
9 2
12.5%
1 1
 
6.2%
6 1
 
6.2%
2 1
 
6.2%
5 1
 
6.2%
7 1
 
6.2%
Other Punctuation
ValueCountFrequency (%)
. 3
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 20
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 3
15.0%
3 3
15.0%
8 3
15.0%
0 3
15.0%
9 2
10.0%
- 1
 
5.0%
1 1
 
5.0%
6 1
 
5.0%
2 1
 
5.0%
5 1
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 3
15.0%
3 3
15.0%
8 3
15.0%
0 3
15.0%
9 2
10.0%
- 1
 
5.0%
1 1
 
5.0%
6 1
 
5.0%
2 1
 
5.0%
5 1
 
5.0%

fieldNumber
Text

Missing 

Distinct 25293
Distinct (%) 14.0%
Missing 274230
Missing (%) 60.2%
Memory size 3.5 MiB

Length

Max length 149
Median length 70
Mean length 10.07333297
Min length 1

Characters and Unicode

Total characters 1823374
Distinct characters 82
Distinct categories 11
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 10525
Unique (%) 5.8%

Sample

1st row FJS-455
2nd row M10-97B4 (40-60m)
3rd row SP 78-18
4th row BBC 1734 A; M-84
5th row PHISH-2016-05; SIA-06
ValueCountFrequency (%)
vgs 19290
 
5.7%
jtw 14298
 
4.2%
bbc 6125
 
1.8%
lwk 4274
 
1.3%
lk 4258
 
1.3%
sol 3414
 
1.0%
rpv 3292
 
1.0%
sp 3134
 
0.9%
bayley 2740
 
0.8%
lrp 2643
 
0.8%
Other values (22433) 275098
81.3%

Most occurring characters

ValueCountFrequency (%)
- 190034
 
10.4%
157556
 
8.6%
1 125561
 
6.9%
0 109954
 
6.0%
2 103382
 
5.7%
9 89328
 
4.9%
6 82133
 
4.5%
7 76846
 
4.2%
3 73154
 
4.0%
8 68615
 
3.8%
Other values (72) 746811
41.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 855931
46.9%
Uppercase Letter 548159
30.1%
Dash Punctuation 190034
 
10.4%
Space Separator 157556
 
8.6%
Other Punctuation 39952
 
2.2%
Lowercase Letter 29687
 
1.6%
Close Punctuation 995
 
0.1%
Open Punctuation 994
 
0.1%
Math Symbol 62
 
< 0.1%
Final Punctuation 3
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 54494
 
9.9%
A 45930
 
8.4%
V 35511
 
6.5%
L 34670
 
6.3%
W 32879
 
6.0%
T 31272
 
5.7%
B 30331
 
5.5%
G 27885
 
5.1%
C 27372
 
5.0%
M 26823
 
4.9%
Other values (16) 200992
36.7%
Lowercase Letter
ValueCountFrequency (%)
o 5686
19.2%
t 3330
11.2%
e 3298
11.1%
a 3141
10.6%
r 2573
8.7%
i 1865
 
6.3%
n 1550
 
5.2%
u 1505
 
5.1%
m 1427
 
4.8%
l 1415
 
4.8%
Other values (15) 3897
13.1%
Other Punctuation
ValueCountFrequency (%)
; 32939
82.4%
. 4329
 
10.8%
, 1177
 
2.9%
# 917
 
2.3%
: 267
 
0.7%
/ 177
 
0.4%
& 60
 
0.2%
' 43
 
0.1%
? 31
 
0.1%
" 6
 
< 0.1%
Other values (2) 6
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 125561
14.7%
0 109954
12.8%
2 103382
12.1%
9 89328
10.4%
6 82133
9.6%
7 76846
9.0%
3 73154
8.5%
8 68615
8.0%
4 65365
7.6%
5 61593
7.2%
Close Punctuation
ValueCountFrequency (%)
) 992
99.7%
] 3
 
0.3%
Open Punctuation
ValueCountFrequency (%)
( 991
99.7%
[ 3
 
0.3%
Dash Punctuation
ValueCountFrequency (%)
- 190034
100.0%
Space Separator
ValueCountFrequency (%)
157556
100.0%
Math Symbol
ValueCountFrequency (%)
+ 62
100.0%
Final Punctuation
ValueCountFrequency (%)
3
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1245528
68.3%
Latin 577846
31.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 54494
 
9.4%
A 45930
 
7.9%
V 35511
 
6.1%
L 34670
 
6.0%
W 32879
 
5.7%
T 31272
 
5.4%
B 30331
 
5.2%
G 27885
 
4.8%
C 27372
 
4.7%
M 26823
 
4.6%
Other values (41) 230679
39.9%
Common
ValueCountFrequency (%)
- 190034
15.3%
157556
12.6%
1 125561
10.1%
0 109954
8.8%
2 103382
8.3%
9 89328
7.2%
6 82133
6.6%
7 76846
6.2%
3 73154
 
5.9%
8 68615
 
5.5%
Other values (21) 168965
13.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1823371
> 99.9%
Punctuation 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 190034
 
10.4%
157556
 
8.6%
1 125561
 
6.9%
0 109954
 
6.0%
2 103382
 
5.7%
9 89328
 
4.9%
6 82133
 
4.5%
7 76846
 
4.2%
3 73154
 
4.0%
8 68615
 
3.8%
Other values (71) 746808
41.0%
Punctuation
ValueCountFrequency (%)
3
100.0%

eventDate
Text

Missing 

Distinct 31124
Distinct (%) 7.8%
Missing 57339
Missing (%) 12.6%
Memory size 3.5 MiB

Length

Max length 24
Median length 10
Mean length 10.1725957
Min length 4

Characters and Unicode

Total characters 4047686
Distinct characters 24
Distinct categories 8
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 8378
Unique (%) 2.1%

Sample

1st row 1938-03-25
2nd row 1956-05-30
3rd row 1997-05-10
4th row 1978-05-22
5th row 1928-02-10

Value Count Frequency (%)
1906 1424 0.4%
1888 1089 0.3%
1902 1068 0.3%
1889 931 0.2%
1994-05-06 927 0.2%
1994-04-30 702 0.2%
1901 577 0.1%
1880 566 0.1%
1970-09-11/1970-09-16 510 0.1%
1893 439 0.1%
Other values (31087) 389868 97.9%

Most occurring characters

ValueCountFrequency (%)
- 780232
19.3%
1 741851
18.3%
0 654489
16.2%
9 535927
13.2%
2 290388
 
7.2%
8 218774
 
5.4%
6 185334
 
4.6%
7 184332
 
4.6%
5 152860
 
3.8%
3 143070
 
3.5%
Other values (14) 160429
 
4.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3244130
80.1%
Dash Punctuation 780232
 
19.3%
Other Punctuation 22917
 
0.6%
Space Separator 200
 
< 0.1%
Lowercase Letter 198
 
< 0.1%
Uppercase Letter 7
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 741851
22.9%
0 654489
20.2%
9 535927
16.5%
2 290388
 
9.0%
8 218774
 
6.7%
6 185334
 
5.7%
7 184332
 
5.7%
5 152860
 
4.7%
3 143070
 
4.4%
4 137105
 
4.2%
Uppercase Letter
ValueCountFrequency (%)
G 2
28.6%
S 2
28.6%
W 1
14.3%
E 1
14.3%
P 1
14.3%
Other Punctuation
ValueCountFrequency (%)
/ 22478
98.1%
, 438
 
1.9%
: 1
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
o 99
50.0%
r 99
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 780232
100.0%
Space Separator
ValueCountFrequency (%)
200
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4047481
> 99.9%
Latin 205
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
- 780232
19.3%
1 741851
18.3%
0 654489
16.2%
9 535927
13.2%
2 290388
 
7.2%
8 218774
 
5.4%
6 185334
 
4.6%
7 184332
 
4.6%
5 152860
 
3.8%
3 143070
 
3.5%
Other values (7) 160224
 
4.0%
Latin
ValueCountFrequency (%)
o 99
48.3%
r 99
48.3%
G 2
 
1.0%
S 2
 
1.0%
W 1
 
0.5%
E 1
 
0.5%
P 1
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4047686
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 780232
19.3%
1 741851
18.3%
0 654489
16.2%
9 535927
13.2%
2 290388
 
7.2%
8 218774
 
5.4%
6 185334
 
4.6%
7 184332
 
4.6%
5 152860
 
3.8%
3 143070
 
3.5%
Other values (14) 160429
 
4.0%
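
The eventDate values listed in this section come in at least three shapes: a bare year (`1906`), a full date (`1994-05-06`), and a start/end interval separated by a slash (`1970-09-11/1970-09-16`). The sketch below is one possible way to split such strings into a start and an end. The function name `split_event_date` and the extra year-month shape are assumptions, and anything else (the character tables show a few letters, commas and parentheses) is returned as `None` for manual review.

```python
import re
from typing import Optional, Tuple

# Year, year-month, or full date, e.g. "1906", "1906-05", "1994-05-06".
_PARTIAL = r"\d{4}(?:-\d{2}(?:-\d{2})?)?"
# A single partial date, optionally followed by "/" and a second one.
_PATTERN = re.compile("^(" + _PARTIAL + ")(?:/(" + _PARTIAL + "))?$")


def split_event_date(value: str) -> Optional[Tuple[str, str]]:
    """Return (start, end) for a single date or a start/end interval.

    A single date yields the same string twice. Strings that do not match
    the assumed shapes return None.
    """
    match = _PATTERN.match(value.strip())
    if match is None:
        return None
    start, end = match.group(1), match.group(2)
    return (start, end if end is not None else start)


for text in ["1906", "1994-05-06", "1970-09-11/1970-09-16"]:
    print(text, "->", split_event_date(text))
```

The function only checks the shape of the string; it does not verify that the month and day are valid calendar values.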

eventTime
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 455239
Missing (%) > 99.9%
Memory size 3.5 MiB

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Characters and Unicode

Total characters 3
Distinct characters 3
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row 941

Value Count Frequency (%)
941 1 100.0%

Most occurring characters

ValueCountFrequency (%)
9 1
33.3%
4 1
33.3%
1 1
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 1
33.3%
4 1
33.3%
1 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 3
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9 1
33.3%
4 1
33.3%
1 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 1
33.3%
4 1
33.3%
1 1
33.3%

startDayOfYear
Text

Missing 

Distinct 366
Distinct (%) 0.1%
Missing 78261
Missing (%) 17.2%
Memory size 3.5 MiB

Length

Max length 3
Median length 3
Mean length 2.769477875
Min length 1

Characters and Unicode

Total characters 1044035
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 84
2nd row 151
3rd row 130
4th row 142
5th row 41

Value Count Frequency (%)
120 2836 0.8%
126 2410 0.6%
243 2274 0.6%
152 2002 0.5%
151 1999 0.5%
251 1989 0.5%
90 1984 0.5%
212 1868 0.5%
117 1868 0.5%
146 1854 0.5%
Other values (356) 355895 94.4%

Most occurring characters

ValueCountFrequency (%)
1 211512
20.3%
2 189879
18.2%
3 134995
12.9%
5 83694
 
8.0%
4 82733
 
7.9%
6 77281
 
7.4%
0 70636
 
6.8%
7 67025
 
6.4%
9 65179
 
6.2%
8 61101
 
5.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1044035
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 211512
20.3%
2 189879
18.2%
3 134995
12.9%
5 83694
 
8.0%
4 82733
 
7.9%
6 77281
 
7.4%
0 70636
 
6.8%
7 67025
 
6.4%
9 65179
 
6.2%
8 61101
 
5.9%

Most occurring scripts

ValueCountFrequency (%)
Common 1044035
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 211512
20.3%
2 189879
18.2%
3 134995
12.9%
5 83694
 
8.0%
4 82733
 
7.9%
6 77281
 
7.4%
0 70636
 
6.8%
7 67025
 
6.4%
9 65179
 
6.2%
8 61101
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1044035
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 211512
20.3%
2 189879
18.2%
3 134995
12.9%
5 83694
 
8.0%
4 82733
 
7.9%
6 77281
 
7.4%
0 70636
 
6.8%
7 67025
 
6.4%
9 65179
 
6.2%
8 61101
 
5.9%
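
The 366 distinct startDayOfYear values are consistent with an ordinal day of the year that includes leap years. As a minimal sketch, assuming the field is derived from a full `YYYY-MM-DD` date, the standard library can reproduce it. The helper name `day_of_year` is hypothetical, and the date/ordinal pairs below are the sample rows shown for eventDate and startDayOfYear in this report, which may or may not belong to the same records.

```python
from datetime import date


def day_of_year(iso_date: str) -> int:
    """Return the 1-based ordinal day within the year for a YYYY-MM-DD string."""
    return date.fromisoformat(iso_date).timetuple().tm_yday


# Sample eventDate rows paired with the sample startDayOfYear rows.
pairs = [
    ("1938-03-25", 84),
    ("1956-05-30", 151),
    ("1997-05-10", 130),
    ("1978-05-22", 142),
    ("1928-02-10", 41),
]
for iso_date, reported in pairs:
    print(iso_date, day_of_year(iso_date), reported)
```

For these five rows the computed ordinal equals the reported value. Year-only or interval eventDate strings would need to be handled before calling the function, because `date.fromisoformat` raises `ValueError` on them.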

endDayOfYear
Text

Missing 

Distinct 366
Distinct (%) 0.1%
Missing 78054
Missing (%) 17.1%
Memory size 3.5 MiB

Length

Max length 3
Median length 3
Mean length 2.770537613
Min length 1

Characters and Unicode

Total characters 1045008
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 84
2nd row 151
3rd row 130
4th row 142
5th row 41

Value Count Frequency (%)
120 2613 0.7%
243 2527 0.7%
126 2497 0.7%
151 2080 0.6%
251 1981 0.5%
90 1922 0.5%
117 1873 0.5%
212 1812 0.5%
161 1811 0.5%
181 1799 0.5%
Other values (356) 356271 94.5%

Most occurring characters

ValueCountFrequency (%)
1 211325
20.2%
2 190059
18.2%
3 135983
13.0%
5 84094
 
8.0%
4 82141
 
7.9%
6 77433
 
7.4%
0 69718
 
6.7%
7 66926
 
6.4%
9 65843
 
6.3%
8 61486
 
5.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1045008
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 211325
20.2%
2 190059
18.2%
3 135983
13.0%
5 84094
 
8.0%
4 82141
 
7.9%
6 77433
 
7.4%
0 69718
 
6.7%
7 66926
 
6.4%
9 65843
 
6.3%
8 61486
 
5.9%

Most occurring scripts

ValueCountFrequency (%)
Common 1045008
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 211325
20.2%
2 190059
18.2%
3 135983
13.0%
5 84094
 
8.0%
4 82141
 
7.9%
6 77433
 
7.4%
0 69718
 
6.7%
7 66926
 
6.4%
9 65843
 
6.3%
8 61486
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1045008
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 211325
20.2%
2 190059
18.2%
3 135983
13.0%
5 84094
 
8.0%
4 82141
 
7.9%
6 77433
 
7.4%
0 69718
 
6.7%
7 66926
 
6.4%
9 65843
 
6.3%
8 61486
 
5.9%

year
Text

Missing 

Distinct 191
Distinct (%) < 0.1%
Missing 57340
Missing (%) 12.6%
Memory size 3.5 MiB

Length

Max length 4
Median length 4
Mean length 4
Min length 4

Characters and Unicode

Total characters 1591600
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 6
Unique (%) < 0.1%

Sample

1st row 1938
2nd row 1956
3rd row 1997
4th row 1978
5th row 1928

Value Count Frequency (%)
1909 17715 4.5%
1908 13647 3.4%
1970 11217 2.8%
1969 10457 2.6%
1964 9091 2.3%
1978 8984 2.3%
1967 8381 2.1%
1979 7849 2.0%
1971 7834 2.0%
1968 7274 1.8%
Other values (181) 295451 74.3%

Most occurring characters

ValueCountFrequency (%)
9 434234
27.3%
1 414427
26.0%
0 150681
 
9.5%
8 134134
 
8.4%
7 104317
 
6.6%
6 101149
 
6.4%
2 84018
 
5.3%
5 60071
 
3.8%
4 57613
 
3.6%
3 50956
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1591600
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 434234
27.3%
1 414427
26.0%
0 150681
 
9.5%
8 134134
 
8.4%
7 104317
 
6.6%
6 101149
 
6.4%
2 84018
 
5.3%
5 60071
 
3.8%
4 57613
 
3.6%
3 50956
 
3.2%

Most occurring scripts

ValueCountFrequency (%)
Common 1591600
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9 434234
27.3%
1 414427
26.0%
0 150681
 
9.5%
8 134134
 
8.4%
7 104317
 
6.6%
6 101149
 
6.4%
2 84018
 
5.3%
5 60071
 
3.8%
4 57613
 
3.6%
3 50956
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1591600
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 434234
27.3%
1 414427
26.0%
0 150681
 
9.5%
8 134134
 
8.4%
7 104317
 
6.6%
6 101149
 
6.4%
2 84018
 
5.3%
5 60071
 
3.8%
4 57613
 
3.6%
3 50956
 
3.2%

month
Text

Missing 

Distinct 15
Distinct (%) < 0.1%
Missing 77299
Missing (%) 17.0%
Memory size 3.5 MiB

Length

Max length 9
Median length 1
Mean length 1.196401026
Min length 1

Characters and Unicode

Total characters 452169
Distinct characters 13
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3
Unique (%) < 0.1%

Sample

1st row 3
2nd row 5
3rd row 5
4th row 5
5th row 2

Value Count Frequency (%)
5 47615 12.6%
6 38303 10.1%
9 38275 10.1%
8 35803 9.5%
4 35163 9.3%
7 33904 9.0%
3 33062 8.7%
11 31787 8.4%
10 24450 6.5%
2 22872 6.1%
Other values (5) 36707 9.7%

Most occurring characters

ValueCountFrequency (%)
1 124730
27.6%
5 47616
 
10.5%
2 40845
 
9.0%
6 38305
 
8.5%
9 38277
 
8.5%
8 35804
 
7.9%
4 35163
 
7.8%
7 33905
 
7.5%
3 33064
 
7.3%
0 24457
 
5.4%
Other values (3) 3
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 452166
> 99.9%
Uppercase Letter 2
 
< 0.1%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 124730
27.6%
5 47616
 
10.5%
2 40845
 
9.0%
6 38305
 
8.5%
9 38277
 
8.5%
8 35804
 
7.9%
4 35163
 
7.8%
7 33905
 
7.5%
3 33064
 
7.3%
0 24457
 
5.4%
Uppercase Letter
ValueCountFrequency (%)
S 1
50.0%
N 1
50.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 452167
> 99.9%
Latin 2
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 124730
27.6%
5 47616
 
10.5%
2 40845
 
9.0%
6 38305
 
8.5%
9 38277
 
8.5%
8 35804
 
7.9%
4 35163
 
7.8%
7 33905
 
7.5%
3 33064
 
7.3%
0 24457
 
5.4%
Latin
ValueCountFrequency (%)
S 1
50.0%
N 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 452169
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 124730
27.6%
5 47616
 
10.5%
2 40845
 
9.0%
6 38305
 
8.5%
9 38277
 
8.5%
8 35804
 
7.9%
4 35163
 
7.8%
7 33905
 
7.5%
3 33064
 
7.3%
0 24457
 
5.4%
Other values (3) 3
 
< 0.1%
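
The month summary reports 15 distinct values, a maximum length of 9, and a few uppercase letters and a period among the characters, so some entries cannot be calendar months. The sketch below shows one way such entries might be flagged. The function name `is_valid_month` and the non-numeric test strings are hypothetical, since the report does not show the offending values themselves.

```python
def is_valid_month(value: str) -> bool:
    """True when value is a plain ASCII integer string between 1 and 12."""
    text = value.strip()
    # isascii() guards against digit-like characters that int() cannot parse.
    return text.isascii() and text.isdigit() and 1 <= int(text) <= 12


# "5" and "11" appear in the report; the last three strings are invented
# purely to exercise the check.
for candidate in ["5", "11", "13", "N.", ""]:
    print(repr(candidate), is_valid_month(candidate))
```

The same pattern with a 1 to 31 range could be applied to the day field, whose summary shows a similar handful of non-digit characters.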

day
Text

Missing 

Distinct 34
Distinct (%) < 0.1%
Missing 91020
Missing (%) 20.0%
Memory size 3.5 MiB

Length

Max length 10
Median length 2
Mean length 1.692839493
Min length 1

Characters and Unicode

Total characters 616566
Distinct characters 12
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3
Unique (%) < 0.1%

Sample

1st row 25
2nd row 30
3rd row 10
4th row 22
5th row 10

Value Count Frequency (%)
8 14008 3.8%
15 13162 3.6%
6 12818 3.5%
7 12792 3.5%
23 12682 3.5%
5 12653 3.5%
11 12561 3.4%
3 12482 3.4%
16 12373 3.4%
14 12197 3.3%
Other values (24) 236492 64.9%

Most occurring characters

ValueCountFrequency (%)
1 162266
26.3%
2 151474
24.6%
3 52362
 
8.5%
5 37364
 
6.1%
8 36930
 
6.0%
6 36819
 
6.0%
4 36163
 
5.9%
7 35942
 
5.8%
9 33781
 
5.5%
0 33462
 
5.4%
Other values (2) 3
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 616563
> 99.9%
Uppercase Letter 2
 
< 0.1%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 162266
26.3%
2 151474
24.6%
3 52362
 
8.5%
5 37364
 
6.1%
8 36930
 
6.0%
6 36819
 
6.0%
4 36163
 
5.9%
7 35942
 
5.8%
9 33781
 
5.5%
0 33462
 
5.4%
Uppercase Letter
ValueCountFrequency (%)
E 2
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 616564
> 99.9%
Latin 2
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 162266
26.3%
2 151474
24.6%
3 52362
 
8.5%
5 37364
 
6.1%
8 36930
 
6.0%
6 36819
 
6.0%
4 36163
 
5.9%
7 35942
 
5.8%
9 33781
 
5.5%
0 33462
 
5.4%
Latin
ValueCountFrequency (%)
E 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 616566
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 162266
26.3%
2 151474
24.6%
3 52362
 
8.5%
5 37364
 
6.1%
8 36930
 
6.0%
6 36819
 
6.0%
4 36163
 
5.9%
7 35942
 
5.8%
9 33781
 
5.5%
0 33462
 
5.4%
Other values (2) 3
 
< 0.1%

verbatimEventDate
Text

Missing 

Distinct 34048
Distinct (%) 9.4%
Missing 92476
Missing (%) 20.3%
Memory size 3.5 MiB

Length

Max length 102
Median length 98
Mean length 26.64552712
Min length 2

Characters and Unicode

Total characters 9666038
Distinct characters 71
Distinct categories 9
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 10959
Unique (%) 3.0%

Sample

1st row 0000 00 00 - 0000 00 00
2nd row 1938 Mar 25 - 0000 00 00
3rd row 0000 00 00 - 0000 00 00
4th row 1956 May 30 - 0000 00 00
5th row 0000 00 00 - 0000 00 00

Value Count Frequency (%)
00 796851
29.5%
420518
15.6%
0000 373823
13.9%
may 36089
 
1.3%
jun 32032
 
1.2%
sep 31702
 
1.2%
aug 30347
 
1.1%
apr 29082
 
1.1%
jul 27008
 
1.0%
mar 26147
 
1.0%
Other values (2489) 893661
33.1%

Most occurring characters

ValueCountFrequency (%)
0 3534006
36.6%
2334496
24.2%
1 644269
 
6.7%
9 432319
 
4.5%
- 426040
 
4.4%
2 223250
 
2.3%
: 175115
 
1.8%
8 153828
 
1.6%
3 151323
 
1.6%
5 144572
 
1.5%
Other values (61) 1446820
15.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5679155
58.8%
Space Separator 2334496
24.2%
Lowercase Letter 643756
 
6.7%
Dash Punctuation 426042
 
4.4%
Uppercase Letter 314163
 
3.3%
Other Punctuation 267714
 
2.8%
Open Punctuation 326
 
< 0.1%
Close Punctuation 326
 
< 0.1%
Math Symbol 60
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 91611
14.2%
a 81381
12.6%
e 70067
10.9%
p 62177
9.7%
r 58271
9.1%
n 50117
7.8%
y 36476
 
5.7%
c 34896
 
5.4%
g 30936
 
4.8%
l 28218
 
4.4%
Other values (14) 99606
15.5%
Uppercase Letter
ValueCountFrequency (%)
J 75519
24.0%
M 63520
20.2%
A 61269
19.5%
S 31966
10.2%
N 25249
 
8.0%
F 19093
 
6.1%
O 17674
 
5.6%
D 16285
 
5.2%
H 1233
 
0.4%
T 729
 
0.2%
Other values (13) 1626
 
0.5%
Decimal Number
ValueCountFrequency (%)
0 3534006
62.2%
1 644269
 
11.3%
9 432319
 
7.6%
2 223250
 
3.9%
8 153828
 
2.7%
3 151323
 
2.7%
5 144572
 
2.5%
6 135753
 
2.4%
7 135700
 
2.4%
4 124135
 
2.2%
Other Punctuation
ValueCountFrequency (%)
: 175115
65.4%
; 89849
33.6%
, 1348
 
0.5%
. 1124
 
0.4%
/ 273
 
0.1%
? 5
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 426040
> 99.9%
2
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 325
99.7%
[ 1
 
0.3%
Close Punctuation
ValueCountFrequency (%)
) 325
99.7%
] 1
 
0.3%
Space Separator
ValueCountFrequency (%)
2334496
100.0%
Math Symbol
ValueCountFrequency (%)
+ 60
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8708119
90.1%
Latin 957919
 
9.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 91611
 
9.6%
a 81381
 
8.5%
J 75519
 
7.9%
e 70067
 
7.3%
M 63520
 
6.6%
p 62177
 
6.5%
A 61269
 
6.4%
r 58271
 
6.1%
n 50117
 
5.2%
y 36476
 
3.8%
Other values (37) 307511
32.1%
Common
ValueCountFrequency (%)
0 3534006
40.6%
2334496
26.8%
1 644269
 
7.4%
9 432319
 
5.0%
- 426040
 
4.9%
2 223250
 
2.6%
: 175115
 
2.0%
8 153828
 
1.8%
3 151323
 
1.7%
5 144572
 
1.7%
Other values (14) 488901
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9666036
> 99.9%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 3534006
36.6%
2334496
24.2%
1 644269
 
6.7%
9 432319
 
4.5%
- 426040
 
4.4%
2 223250
 
2.3%
: 175115
 
1.8%
8 153828
 
1.6%
3 151323
 
1.6%
5 144572
 
1.5%
Other values (60) 1446818
15.0%
Punctuation
ValueCountFrequency (%)
2
100.0%
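
The sample rows for verbatimEventDate follow a `YYYY Mon DD - YYYY Mon DD` pattern in which all-zero components appear to stand for an unknown year, month or day. The sketch below parses only that shape, under the assumption that months are three-letter English abbreviations. The names `parse_verbatim_range` and `_parse_side` are hypothetical, and longer values (the column also contains colons and semicolons) are returned as `None` rather than guessed at.

```python
import re
from typing import Optional, Tuple

Parts = Tuple[Optional[int], Optional[int], Optional[int]]

_MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}
# One side of the range: year, month abbreviation or "00", day.
_SIDE = re.compile(r"(\d{4}) ([A-Za-z]{3}|00) (\d{2})")


def _parse_side(text: str) -> Optional[Parts]:
    """Parse 'YYYY Mon DD'; zero-valued components become None."""
    match = _SIDE.fullmatch(text.strip())
    if match is None:
        return None
    year_s, month_s, day_s = match.groups()
    if month_s == "00":
        month = None
    else:
        month = _MONTHS.get(month_s.capitalize())
        if month is None:
            return None  # three letters, but not a known abbreviation
    return (int(year_s) or None, month, int(day_s) or None)


def parse_verbatim_range(value: str) -> Optional[Tuple[Parts, Parts]]:
    """Split 'start - end' and parse both sides; None if the shape differs."""
    left, sep, right = value.partition(" - ")
    if not sep:
        return None
    start, end = _parse_side(left), _parse_side(right)
    if start is None or end is None:
        return None
    return (start, end)


print(parse_verbatim_range("1938 Mar 25 - 0000 00 00"))
print(parse_verbatim_range("0000 00 00 - 0000 00 00"))
```

Returning `None` components instead of zeros keeps placeholder dates from being mistaken for real ones in later comparisons.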

eventRemarks
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 455239
Missing (%) > 99.9%
Memory size 3.5 MiB

Length

Max length 96
Median length 96
Mean length 96
Min length 96

Characters and Unicode

Total characters 96
Distinct characters 33
Distinct categories 7
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Guide to Best Practices for Georeferencing. (Chapman and Wieczorek, eds. 2006). Google Earth Pro

Value Count Frequency (%)
guide 1 7.1%
to 1 7.1%
best 1 7.1%
practices 1 7.1%
for 1 7.1%
georeferencing 1 7.1%
chapman 1 7.1%
and 1 7.1%
wieczorek 1 7.1%
eds 1 7.1%
Other values (4) 4 28.6%

Most occurring characters

ValueCountFrequency (%)
13
 
13.5%
e 11
 
11.5%
o 7
 
7.3%
r 7
 
7.3%
a 5
 
5.2%
n 4
 
4.2%
i 4
 
4.2%
t 4
 
4.2%
c 4
 
4.2%
G 3
 
3.1%
Other values (23) 34
35.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 64
66.7%
Space Separator 13
 
13.5%
Uppercase Letter 9
 
9.4%
Other Punctuation 4
 
4.2%
Decimal Number 4
 
4.2%
Close Punctuation 1
 
1.0%
Open Punctuation 1
 
1.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 11
17.2%
o 7
10.9%
r 7
10.9%
a 5
7.8%
n 4
 
6.2%
i 4
 
6.2%
t 4
 
6.2%
c 4
 
6.2%
d 3
 
4.7%
s 3
 
4.7%
Other values (9) 12
18.8%
Uppercase Letter
ValueCountFrequency (%)
G 3
33.3%
P 2
22.2%
B 1
 
11.1%
W 1
 
11.1%
C 1
 
11.1%
E 1
 
11.1%
Decimal Number
ValueCountFrequency (%)
0 2
50.0%
6 1
25.0%
2 1
25.0%
Other Punctuation
ValueCountFrequency (%)
. 3
75.0%
, 1
 
25.0%
Space Separator
ValueCountFrequency (%)
13
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 73
76.0%
Common 23
 
24.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 11
15.1%
o 7
 
9.6%
r 7
 
9.6%
a 5
 
6.8%
n 4
 
5.5%
i 4
 
5.5%
t 4
 
5.5%
c 4
 
5.5%
G 3
 
4.1%
d 3
 
4.1%
Other values (15) 21
28.8%
Common
ValueCountFrequency (%)
13
56.5%
. 3
 
13.0%
0 2
 
8.7%
, 1
 
4.3%
) 1
 
4.3%
6 1
 
4.3%
2 1
 
4.3%
( 1
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 96
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
13
 
13.5%
e 11
 
11.5%
o 7
 
7.3%
r 7
 
7.3%
a 5
 
5.2%
n 4
 
4.2%
i 4
 
4.2%
t 4
 
4.2%
c 4
 
4.2%
G 3
 
3.1%
Other values (23) 34
35.4%

locationID
Text

Missing 

Distinct 16404
Distinct (%) 15.9%
Missing 352031
Missing (%) 77.3%
Memory size 3.5 MiB

Length

Max length 68
Median length 40
Mean length 5.14473544
Min length 1

Characters and Unicode

Total characters 530983
Distinct characters 75
Distinct categories 9
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 6088
Unique (%) 5.9%

Sample

1st row M10-97B4 (4
2nd row 4-31N
3rd row 5627
4th row 308
5th row B12 TR4

Value Count Frequency (%)
d 13063 9.8%
tc 3543 2.7%
haul 1244 0.9%
trans 1038 0.8%
1 918 0.7%
2 894 0.7%
tt 799 0.6%
4 661 0.5%
3 656 0.5%
5 629 0.5%
Other values (13796) 109258 82.3%

Most occurring characters

ValueCountFrequency (%)
1 61873
 
11.7%
2 49296
 
9.3%
- 37683
 
7.1%
3 36374
 
6.9%
4 36203
 
6.8%
5 34154
 
6.4%
0 31690
 
6.0%
29494
 
5.6%
7 29224
 
5.5%
6 27488
 
5.2%
Other values (65) 157504
29.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 348723
65.7%
Uppercase Letter 95285
 
17.9%
Dash Punctuation 37683
 
7.1%
Space Separator 29494
 
5.6%
Other Punctuation 11084
 
2.1%
Lowercase Letter 7547
 
1.4%
Open Punctuation 805
 
0.2%
Close Punctuation 350
 
0.1%
Math Symbol 12
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
D 15186
15.9%
A 11061
11.6%
T 9958
10.5%
C 7283
 
7.6%
N 6625
 
7.0%
S 5565
 
5.8%
M 4811
 
5.0%
B 4690
 
4.9%
E 4554
 
4.8%
P 4398
 
4.6%
Other values (16) 21154
22.2%
Lowercase Letter
ValueCountFrequency (%)
u 1395
18.5%
a 1021
13.5%
r 854
11.3%
i 678
9.0%
m 582
7.7%
q 576
7.6%
o 499
 
6.6%
n 332
 
4.4%
t 315
 
4.2%
e 236
 
3.1%
Other values (13) 1059
14.0%
Decimal Number
ValueCountFrequency (%)
1 61873
17.7%
2 49296
14.1%
3 36374
10.4%
4 36203
10.4%
5 34154
9.8%
0 31690
9.1%
7 29224
8.4%
6 27488
7.9%
8 21652
 
6.2%
9 20769
 
6.0%
Other Punctuation
ValueCountFrequency (%)
. 7464
67.3%
; 1088
 
9.8%
/ 1054
 
9.5%
, 808
 
7.3%
& 269
 
2.4%
? 185
 
1.7%
# 112
 
1.0%
: 101
 
0.9%
' 2
 
< 0.1%
" 1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
+ 9
75.0%
= 3
 
25.0%
Dash Punctuation
ValueCountFrequency (%)
- 37683
100.0%
Space Separator
ValueCountFrequency (%)
29494
100.0%
Open Punctuation
ValueCountFrequency (%)
( 805
100.0%
Close Punctuation
ValueCountFrequency (%)
) 350
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 428151
80.6%
Latin 102832
 
19.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
D 15186
14.8%
A 11061
 
10.8%
T 9958
 
9.7%
C 7283
 
7.1%
N 6625
 
6.4%
S 5565
 
5.4%
M 4811
 
4.7%
B 4690
 
4.6%
E 4554
 
4.4%
P 4398
 
4.3%
Other values (39) 28701
27.9%
Common
ValueCountFrequency (%)
1 61873
14.5%
2 49296
11.5%
- 37683
8.8%
3 36374
8.5%
4 36203
8.5%
5 34154
8.0%
0 31690
7.4%
29494
6.9%
7 29224
6.8%
6 27488
6.4%
Other values (16) 54672
12.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 530983
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 61873
 
11.7%
2 49296
 
9.3%
- 37683
 
7.1%
3 36374
 
6.9%
4 36203
 
6.8%
5 34154
 
6.4%
0 31690
 
6.0%
29494
 
5.6%
7 29224
 
5.5%
6 27488
 
5.2%
Other values (65) 157504
29.7%

higherGeography
Text

Missing 

Distinct 13756
Distinct (%) 3.2%
Missing 20496
Missing (%) 4.5%
Memory size 3.5 MiB

Length

Max length 177
Median length 131
Mean length 59.33825654
Min length 4

Characters and Unicode

Total characters 25796951
Distinct characters 123
Distinct categories 10
Distinct scripts 2
Distinct blocks 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3842
Unique (%) 0.9%

Sample

1st row North Pacific Ocean, United States, Hawaii, Hawaiian Islands
2nd row North Atlantic Ocean, Gulf of Mexico, United States, Florida, Hillsborough County
3rd row North Pacific Ocean, Japan, Tokyo Prefecture, Japanese Archipelago, Honshu
4th row North America, United States, West Virginia, Randolph County
5th row Atlantic, Caribbean Sea, Barbados, Lesser Antilles, Barbados

Value Count Frequency (%)
ocean 297645 8.7%
north 281576 8.2%
pacific 178094 5.2%
united 125561 3.7%
states 125318 3.7%
islands 124604 3.6%
atlantic 113805 3.3%
south 106962 3.1%
america 96818 2.8%
county 72816 2.1%
Other values (6594) 1896781 55.5%

Most occurring characters

ValueCountFrequency (%)
2985236
 
11.6%
a 2677799
 
10.4%
i 1842628
 
7.1%
n 1669084
 
6.5%
e 1623498
 
6.3%
t 1399002
 
5.4%
, 1343029
 
5.2%
o 1188564
 
4.6%
c 1128718
 
4.4%
r 1085860
 
4.2%
Other values (113) 8853533
34.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18063986
70.0%
Uppercase Letter 3374068
 
13.1%
Space Separator 2985236
 
11.6%
Other Punctuation 1353342
 
5.2%
Open Punctuation 6856
 
< 0.1%
Close Punctuation 6856
 
< 0.1%
Dash Punctuation 6542
 
< 0.1%
Format 30
 
< 0.1%
Decimal Number 28
 
< 0.1%
Modifier Letter 7
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2677799
14.8%
i 1842628
10.2%
n 1669084
9.2%
e 1623498
9.0%
t 1399002
 
7.7%
o 1188564
 
6.6%
c 1128718
 
6.2%
r 1085860
 
6.0%
l 929644
 
5.1%
s 877902
 
4.9%
Other values (57) 3641287
20.2%
Uppercase Letter
ValueCountFrequency (%)
S 417875
12.4%
P 393243
11.7%
A 382245
11.3%
N 334400
9.9%
O 326060
9.7%
C 239998
7.1%
I 236833
 
7.0%
M 157031
 
4.7%
B 155810
 
4.6%
U 132241
 
3.9%
Other values (26) 598332
17.7%
Other Punctuation
ValueCountFrequency (%)
, 1343029
99.2%
' 5378
 
0.4%
. 3637
 
0.3%
; 1110
 
0.1%
/ 108
 
< 0.1%
: 80
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 13
46.4%
0 7
25.0%
3 4
 
14.3%
1 3
 
10.7%
4 1
 
3.6%
Dash Punctuation
ValueCountFrequency (%)
- 6358
97.2%
184
 
2.8%
Open Punctuation
ValueCountFrequency (%)
( 5691
83.0%
[ 1165
 
17.0%
Close Punctuation
ValueCountFrequency (%)
) 5691
83.0%
] 1165
 
17.0%
Space Separator
ValueCountFrequency (%)
2985236
100.0%
Format
ValueCountFrequency (%)
30
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 21438054
83.1%
Common 4358897
 
16.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2677799
 
12.5%
i 1842628
 
8.6%
n 1669084
 
7.8%
e 1623498
 
7.6%
t 1399002
 
6.5%
o 1188564
 
5.5%
c 1128718
 
5.3%
r 1085860
 
5.1%
l 929644
 
4.3%
s 877902
 
4.1%
Other values (93) 7015355
32.7%
Common
ValueCountFrequency (%)
2985236
68.5%
, 1343029
30.8%
- 6358
 
0.1%
( 5691
 
0.1%
) 5691
 
0.1%
' 5378
 
0.1%
. 3637
 
0.1%
] 1165
 
< 0.1%
[ 1165
 
< 0.1%
; 1110
 
< 0.1%
Other values (10) 437
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 25788944
> 99.9%
None 7729
 
< 0.1%
Punctuation 214
 
< 0.1%
Latin Ext Additional 57
 
< 0.1%
Modifier Letters 7
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2985236
 
11.6%
a 2677799
 
10.4%
i 1842628
 
7.1%
n 1669084
 
6.5%
e 1623498
 
6.3%
t 1399002
 
5.4%
, 1343029
 
5.2%
o 1188564
 
4.6%
c 1128718
 
4.4%
r 1085860
 
4.2%
Other values (59) 8845526
34.3%
None
ValueCountFrequency (%)
ó 2696
34.9%
á 2628
34.0%
í 1139
14.7%
é 385
 
5.0%
ñ 174
 
2.3%
ã 109
 
1.4%
ú 107
 
1.4%
Ō 78
 
1.0%
Î 59
 
0.8%
Ø 51
 
0.7%
Other values (31) 303
 
3.9%
Punctuation
ValueCountFrequency (%)
184
86.0%
30
 
14.0%
Latin Ext Additional
ValueCountFrequency (%)
15
26.3%
11
19.3%
ế 6
 
10.5%
6
 
10.5%
5
 
8.8%
4
 
7.0%
4
 
7.0%
4
 
7.0%
1
 
1.8%
1
 
1.8%
Modifier Letters
ValueCountFrequency (%)
ʻ 7
100.0%
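
The higherGeography sample rows are comma-separated paths that run from a broad region to a specific locality, and in the five sample rows the first component matches the corresponding continent sample (for example `North Pacific Ocean`). The sketch below splits such a path into its components. The function name `split_higher_geography` is hypothetical, and treating the first component as the continent is an assumption drawn from those sample rows only, not a rule confirmed for the whole dataset.

```python
from typing import List


def split_higher_geography(value: str) -> List[str]:
    """Split a comma-separated geography path into trimmed, non-empty parts."""
    return [part.strip() for part in value.split(",") if part.strip()]


path = "North Pacific Ocean, United States, Hawaii, Hawaiian Islands"
parts = split_higher_geography(path)
print(parts[0])   # broadest component
print(parts[-1])  # most specific component
```

Because some place names may themselves contain commas, a split like this is best treated as a first pass that is checked against the dedicated continent, waterBody and related columns.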

continent
Text

Missing 

Distinct 70
Distinct (%) < 0.1%
Missing 23626
Missing (%) 5.2%
Memory size 3.5 MiB

Length

Max length 51
Median length 43
Mean length 16.40052686
Min length 4

Characters and Unicode

Total characters 7078697
Distinct characters 27
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 7
Unique (%) < 0.1%

Sample

1st row North Pacific Ocean
2nd row North Atlantic Ocean
3rd row North Pacific Ocean
4th row North America
5th row Atlantic

Value Count Frequency (%)
ocean 296332 27.0%
north 270855 24.7%
pacific 178081 16.2%
atlantic 113708 10.4%
america 96818 8.8%
south 90783 8.3%
indian 28799 2.6%
asia 11363 1.0%
africa 6361 0.6%
europe 1535 0.1%
Other values (9) 2293 0.2%

Most occurring characters

ValueCountFrequency (%)
c 870637
12.3%
a 733475
10.4%
665314
9.4%
i 614730
 
8.7%
t 590869
 
8.3%
n 468652
 
6.6%
e 395694
 
5.6%
r 377529
 
5.3%
o 363997
 
5.1%
h 362318
 
5.1%
Other values (17) 1635482
23.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5309592
75.0%
Uppercase Letter 1096736
 
15.5%
Space Separator 665314
 
9.4%
Other Punctuation 7055
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 870637
16.4%
a 733475
13.8%
i 614730
11.6%
t 590869
11.1%
n 468652
8.8%
e 395694
7.5%
r 377529
7.1%
o 363997
6.9%
h 362318
6.8%
f 184444
 
3.5%
Other values (6) 347247
 
6.5%
Uppercase Letter
ValueCountFrequency (%)
O 296661
27.0%
N 270856
24.7%
A 229386
20.9%
P 178083
16.2%
S 91462
 
8.3%
I 28753
 
2.6%
E 1535
 
0.1%
Other Punctuation
ValueCountFrequency (%)
, 7052
> 99.9%
/ 2
 
< 0.1%
. 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
665314
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6406328
90.5%
Common 672369
 
9.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 870637
13.6%
a 733475
11.4%
i 614730
9.6%
t 590869
9.2%
n 468652
 
7.3%
e 395694
 
6.2%
r 377529
 
5.9%
o 363997
 
5.7%
h 362318
 
5.7%
O 296661
 
4.6%
Other values (13) 1331766
20.8%
Common
ValueCountFrequency (%)
665314
99.0%
, 7052
 
1.0%
/ 2
 
< 0.1%
. 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7078697
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 870637
12.3%
a 733475
10.4%
665314
9.4%
i 614730
 
8.7%
t 590869
 
8.3%
n 468652
 
6.6%
e 395694
 
5.6%
r 377529
 
5.3%
o 363997
 
5.1%
h 362318
 
5.1%
Other values (17) 1635482
23.1%

waterBody
Text

Missing 

Distinct: 1777
Distinct (%): 0.6%
Missing: 133285
Missing (%): 29.3%
Memory size: 3.5 MiB

Length

Max length: 72
Median length: 71
Mean length: 24.05961082
Min length: 6

Characters and Unicode

Total characters: 7746112
Distinct characters: 68
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 490
Unique (%): 0.2%

Sample

1st row: North Pacific Ocean
2nd row: North Atlantic Ocean, Gulf of Mexico
3rd row: North Pacific Ocean
4th row: Atlantic, Caribbean Sea
5th row: North Pacific Ocean
ValueCountFrequency (%)
ocean 296332
24.5%
north 200707
16.6%
pacific 178081
14.7%
atlantic 113708
 
9.4%
south 68068
 
5.6%
sea 63587
 
5.3%
of 34823
 
2.9%
gulf 34751
 
2.9%
bay 30113
 
2.5%
indian 28800
 
2.4%
Other values (1367) 159141
13.2%
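
The word table above appears to be built from lowercased, whitespace-separated tokens with surrounding punctuation removed, so that `Ocean` and `Ocean,` both count toward `ocean`. That tokenization is an assumption inferred from the table, not taken from the profiling tool's source. A minimal sketch of such a count, using the five sample rows for this column and a hypothetical helper `word_counts`:

```python
import string
from collections import Counter


def word_counts(values):
    """Count lowercased whitespace-separated tokens, ignoring edge punctuation."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            # Remove leading/trailing punctuation such as the comma in "ocean,".
            token = token.strip(string.punctuation)
            if token:
                counts[token] += 1
    return counts


# The five sample rows shown for the waterBody column.
sample = [
    "North Pacific Ocean",
    "North Atlantic Ocean, Gulf of Mexico",
    "North Pacific Ocean",
    "Atlantic, Caribbean Sea",
    "North Pacific Ocean",
]

counts = word_counts(sample)
for word, n in counts.most_common(5):
    print(word, n)
```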

Most occurring characters

ValueCountFrequency (%)
886156
11.4%
a 885868
11.4%
c 796064
 
10.3%
i 598577
 
7.7%
n 555392
 
7.2%
t 522493
 
6.7%
e 465189
 
6.0%
o 359396
 
4.6%
O 297253
 
3.8%
h 288206
 
3.7%
Other values (58) 2091518
27.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5546613
71.6%
Uppercase Letter 1172334
 
15.1%
Space Separator 886156
 
11.4%
Other Punctuation 139273
 
1.8%
Dash Punctuation 1592
 
< 0.1%
Open Punctuation 72
 
< 0.1%
Close Punctuation 72
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 885868
16.0%
c 796064
14.4%
i 598577
10.8%
n 555392
10.0%
t 522493
9.4%
e 465189
8.4%
o 359396
6.5%
h 288206
 
5.2%
r 272752
 
4.9%
f 249497
 
4.5%
Other values (22) 553179
10.0%
Uppercase Letter
ValueCountFrequency (%)
O 297253
25.4%
N 202077
17.2%
P 187420
16.0%
S 148094
12.6%
A 121008
10.3%
C 45570
 
3.9%
B 39830
 
3.4%
G 37598
 
3.2%
M 31550
 
2.7%
I 31014
 
2.6%
Other values (16) 30920
 
2.6%
Other Punctuation
ValueCountFrequency (%)
, 137360
98.6%
; 1110
 
0.8%
' 517
 
0.4%
. 151
 
0.1%
: 80
 
0.1%
/ 55
 
< 0.1%
Space Separator
ValueCountFrequency (%)
886156
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1592
100.0%
Open Punctuation
ValueCountFrequency (%)
( 72
100.0%
Close Punctuation
ValueCountFrequency (%)
) 72
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6718947
86.7%
Common 1027165
 
13.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 885868
13.2%
c 796064
11.8%
i 598577
 
8.9%
n 555392
 
8.3%
t 522493
 
7.8%
e 465189
 
6.9%
o 359396
 
5.3%
O 297253
 
4.4%
h 288206
 
4.3%
r 272752
 
4.1%
Other values (48) 1677757
25.0%
Common
ValueCountFrequency (%)
886156
86.3%
, 137360
 
13.4%
- 1592
 
0.2%
; 1110
 
0.1%
' 517
 
0.1%
. 151
 
< 0.1%
: 80
 
< 0.1%
( 72
 
< 0.1%
) 72
 
< 0.1%
/ 55
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7745666
> 99.9%
None 446
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
886156
11.4%
a 885868
11.4%
c 796064
 
10.3%
i 598577
 
7.7%
n 555392
 
7.2%
t 522493
 
6.7%
e 465189
 
6.0%
o 359396
 
4.6%
O 297253
 
3.8%
h 288206
 
3.7%
Other values (51) 2091072
27.0%
None
ValueCountFrequency (%)
í 171
38.3%
á 95
21.3%
ñ 68
 
15.2%
é 59
 
13.2%
ó 38
 
8.5%
è 13
 
2.9%
É 2
 
0.4%
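
The seven characters listed under the block label None above all have code points between U+0080 and U+00FF, the range the Unicode Standard names Latin-1 Supplement. Why the report labels that block None is not stated here. The sketch below checks the code points against a small hand-written range table; `BLOCKS` and `block_of` are names invented for this example, and only the two blocks needed are listed.

```python
# Hand-written block ranges from the Unicode Standard; only the two
# blocks needed for this column are included.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
]


def block_of(ch):
    """Return the name of the listed block containing ch, or 'unlisted'."""
    cp = ord(ch)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return "unlisted"


# The characters from the None table: í á ñ é ó è É
none_chars = "\u00ed\u00e1\u00f1\u00e9\u00f3\u00e8\u00c9"

for ch in none_chars:
    print(f"U+{ord(ch):04X}", ch, block_of(ch))
```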

islandGroup
Text

Missing 

Distinct: 323
Distinct (%): 0.5%
Missing: 390836
Missing (%): 85.9%
Memory size: 3.5 MiB

Length

Max length: 45
Median length: 32
Mean length: 14.81491833
Min length: 4

Characters and Unicode

Total characters: 954140
Distinct characters: 64
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 37
Unique (%): 0.1%

Sample

1st row: Florida Islands
2nd row: Vava'u Group
3rd row: Visayas
4th row: Cuyo Islands
5th row: Ha'apai Group
ValueCountFrequency (%)
islands 31620
22.3%
group 13997
 
9.9%
chain 5485
 
3.9%
visayas 4942
 
3.5%
leeward 4825
 
3.4%
ralik 4613
 
3.3%
bahama 2867
 
2.0%
island 2805
 
2.0%
cruz 2205
 
1.6%
santa 2205
 
1.6%
Other values (354) 66281
46.7%

Most occurring characters

ValueCountFrequency (%)
a 143534
15.0%
s 92371
 
9.7%
77441
 
8.1%
n 71042
 
7.4%
l 57393
 
6.0%
d 51471
 
5.4%
r 46475
 
4.9%
u 38565
 
4.0%
o 37606
 
3.9%
i 37529
 
3.9%
Other values (54) 300713
31.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 728696
76.4%
Uppercase Letter 142918
 
15.0%
Space Separator 77441
 
8.1%
Open Punctuation 1946
 
0.2%
Close Punctuation 1946
 
0.2%
Other Punctuation 1144
 
0.1%
Format 30
 
< 0.1%
Dash Punctuation 19
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 143534
19.7%
s 92371
12.7%
n 71042
9.7%
l 57393
 
7.9%
d 51471
 
7.1%
r 46475
 
6.4%
u 38565
 
5.3%
o 37606
 
5.2%
i 37529
 
5.2%
e 33862
 
4.6%
Other values (20) 118848
16.3%
Uppercase Letter
ValueCountFrequency (%)
I 34891
24.4%
C 16634
11.6%
G 16301
11.4%
B 11325
 
7.9%
S 9420
 
6.6%
L 8924
 
6.2%
R 8182
 
5.7%
V 7087
 
5.0%
T 5922
 
4.1%
A 5185
 
3.6%
Other values (16) 19047
13.3%
Open Punctuation
ValueCountFrequency (%)
( 1764
90.6%
[ 182
 
9.4%
Close Punctuation
ValueCountFrequency (%)
) 1764
90.6%
] 182
 
9.4%
Space Separator
ValueCountFrequency (%)
77441
100.0%
Other Punctuation
ValueCountFrequency (%)
' 1144
100.0%
Format
ValueCountFrequency (%)
30
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 19
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 871614
91.4%
Common 82526
 
8.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 143534
16.5%
s 92371
 
10.6%
n 71042
 
8.2%
l 57393
 
6.6%
d 51471
 
5.9%
r 46475
 
5.3%
u 38565
 
4.4%
o 37606
 
4.3%
i 37529
 
4.3%
I 34891
 
4.0%
Other values (46) 260737
29.9%
Common
ValueCountFrequency (%)
77441
93.8%
( 1764
 
2.1%
) 1764
 
2.1%
' 1144
 
1.4%
[ 182
 
0.2%
] 182
 
0.2%
30
 
< 0.1%
- 19
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 954000
> 99.9%
None 110
 
< 0.1%
Punctuation 30
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 143534
15.0%
s 92371
 
9.7%
77441
 
8.1%
n 71042
 
7.4%
l 57393
 
6.0%
d 51471
 
5.4%
r 46475
 
4.9%
u 38565
 
4.0%
o 37606
 
3.9%
i 37529
 
3.9%
Other values (48) 300573
31.5%
None
ValueCountFrequency (%)
Ō 78
70.9%
ñ 18
 
16.4%
ù 5
 
4.5%
à 5
 
4.5%
á 4
 
3.6%
Punctuation
ValueCountFrequency (%)
30
100.0%

island
Text

Missing 

Distinct: 2224
Distinct (%): 1.2%
Missing: 270616
Missing (%): 59.4%
Memory size: 3.5 MiB

Length

Max length: 43
Median length: 37
Mean length: 9.782482234
Min length: 3

Characters and Unicode

Total characters: 1806081
Distinct characters: 80
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 463
Unique (%): 0.3%

Sample

1st row: Honshu
2nd row: Barbados
3rd row: Putic Island
4th row: Guam
5th row: Florida Island
ValueCountFrequency (%)
island 45624
 
15.8%
bermuda 14507
 
5.0%
atoll 13109
 
4.5%
luzon 7632
 
2.6%
oahu 6793
 
2.3%
cay 5201
 
1.8%
carrie 3799
 
1.3%
bow 3799
 
1.3%
new 3013
 
1.0%
cuba 2705
 
0.9%
Other values (2080) 182980
63.3%

Most occurring characters

ValueCountFrequency (%)
a 268869
14.9%
n 133972
 
7.4%
o 116671
 
6.5%
l 110968
 
6.1%
104538
 
5.8%
u 90798
 
5.0%
e 88610
 
4.9%
i 86649
 
4.8%
r 86279
 
4.8%
d 84325
 
4.7%
Other values (70) 634402
35.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1403221
77.7%
Uppercase Letter 287712
 
15.9%
Space Separator 104538
 
5.8%
Open Punctuation 3752
 
0.2%
Close Punctuation 3752
 
0.2%
Other Punctuation 1868
 
0.1%
Dash Punctuation 1229
 
0.1%
Decimal Number 8
 
< 0.1%
Modifier Letter 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 268869
19.2%
n 133972
9.5%
o 116671
8.3%
l 110968
7.9%
u 90798
 
6.5%
e 88610
 
6.3%
i 86649
 
6.2%
r 86279
 
6.1%
d 84325
 
6.0%
s 83768
 
6.0%
Other values (30) 252312
18.0%
Uppercase Letter
ValueCountFrequency (%)
I 50635
17.6%
B 35471
12.3%
C 23026
 
8.0%
M 22484
 
7.8%
A 21386
 
7.4%
S 16305
 
5.7%
T 14185
 
4.9%
L 13183
 
4.6%
O 10785
 
3.7%
N 10606
 
3.7%
Other values (17) 69646
24.2%
Other Punctuation
ValueCountFrequency (%)
' 1251
67.0%
. 551
29.5%
/ 39
 
2.1%
, 27
 
1.4%
Open Punctuation
ValueCountFrequency (%)
( 2827
75.3%
[ 925
 
24.7%
Close Punctuation
ValueCountFrequency (%)
) 2827
75.3%
] 925
 
24.7%
Decimal Number
ValueCountFrequency (%)
3 4
50.0%
0 4
50.0%
Space Separator
ValueCountFrequency (%)
104538
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1229
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1690933
93.6%
Common 115148
 
6.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 268869
15.9%
n 133972
 
7.9%
o 116671
 
6.9%
l 110968
 
6.6%
u 90798
 
5.4%
e 88610
 
5.2%
i 86649
 
5.1%
r 86279
 
5.1%
d 84325
 
5.0%
s 83768
 
5.0%
Other values (57) 540024
31.9%
Common
ValueCountFrequency (%)
104538
90.8%
( 2827
 
2.5%
) 2827
 
2.5%
' 1251
 
1.1%
- 1229
 
1.1%
[ 925
 
0.8%
] 925
 
0.8%
. 551
 
0.5%
/ 39
 
< 0.1%
, 27
 
< 0.1%
Other values (3) 9
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1805576
> 99.9%
None 488
 
< 0.1%
Latin Ext Additional 16
 
< 0.1%
Modifier Letters 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 268869
14.9%
n 133972
 
7.4%
o 116671
 
6.5%
l 110968
 
6.1%
104538
 
5.8%
u 90798
 
5.0%
e 88610
 
4.9%
i 86649
 
4.8%
r 86279
 
4.8%
d 84325
 
4.7%
Other values (53) 633897
35.1%
None
ValueCountFrequency (%)
ó 101
20.7%
é 91
18.6%
á 85
17.4%
ñ 65
13.3%
Î 50
10.2%
ú 41
8.4%
í 17
 
3.5%
Á 12
 
2.5%
â 10
 
2.0%
ô 7
 
1.4%
Other values (4) 9
 
1.8%
Latin Ext Additional
ValueCountFrequency (%)
15
93.8%
1
 
6.2%
Modifier Letters
ValueCountFrequency (%)
ʻ 1
100.0%

country
Text

Missing 

Distinct: 266
Distinct (%): 0.1%
Missing: 35901
Missing (%): 7.9%
Memory size: 3.5 MiB

Length

Max length: 44
Median length: 32
Mean length: 10.33857333
Min length: 4

Characters and Unicode

Total characters: 4335367
Distinct characters: 59
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
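
The summary figures for this column can be cross-checked against each other. The sketch below assumes the 455240 observations reported in the dataset overview, and assumes that Distinct (%) and Mean length are computed over non-missing values only; under those assumptions the arithmetic reproduces the figures shown above for this column.

```python
observations = 455240        # from the report's dataset overview
missing = 35901              # Missing
distinct = 266               # Distinct
total_characters = 4335367   # Total characters

# Values that are actually present (assumed denominator for the ratios below).
present = observations - missing

missing_pct = 100 * missing / observations
distinct_pct = 100 * distinct / present
mean_length = total_characters / present

print(f"Missing (%): {missing_pct:.1f}")
print(f"Distinct (%): {distinct_pct:.1f}")
print(f"Mean length: {mean_length:.8f}")
```

The same check can be repeated for any other variable in the report by substituting its Missing, Distinct and Total characters values.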

Unique

Unique: 15
Unique (%): < 0.1%

Sample

1st row: United States
2nd row: United States
3rd row: Japan
4th row: United States
5th row: Barbados
ValueCountFrequency (%)
united 123991
19.9%
states 123749
19.9%
philippines 46134
 
7.4%
bermuda 15778
 
2.5%
islands 15706
 
2.5%
indonesia 12767
 
2.0%
brazil 11589
 
1.9%
french 10754
 
1.7%
new 10627
 
1.7%
panama 10452
 
1.7%
Other values (286) 241779
38.8%

Most occurring characters

ValueCountFrequency (%)
a 492047
11.3%
e 448033
 
10.3%
i 431908
 
10.0%
t 414009
 
9.5%
n 350125
 
8.1%
s 273998
 
6.3%
203987
 
4.7%
d 203110
 
4.7%
l 161645
 
3.7%
S 148976
 
3.4%
Other values (49) 1207529
27.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3511733
81.0%
Uppercase Letter 616400
 
14.2%
Space Separator 203987
 
4.7%
Other Punctuation 1861
 
< 0.1%
Close Punctuation 628
 
< 0.1%
Open Punctuation 628
 
< 0.1%
Dash Punctuation 130
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 492047
14.0%
e 448033
12.8%
i 431908
12.3%
t 414009
11.8%
n 350125
10.0%
s 273998
7.8%
d 203110
 
5.8%
l 161645
 
4.6%
p 112903
 
3.2%
r 111771
 
3.2%
Other values (16) 512184
14.6%
Uppercase Letter
ValueCountFrequency (%)
S 148976
24.2%
U 124934
20.3%
P 83591
13.6%
B 42323
 
6.9%
I 35277
 
5.7%
M 27210
 
4.4%
C 23354
 
3.8%
T 19288
 
3.1%
F 17674
 
2.9%
G 16572
 
2.7%
Other values (15) 77201
12.5%
Other Punctuation
ValueCountFrequency (%)
. 1384
74.4%
' 261
 
14.0%
, 204
 
11.0%
/ 12
 
0.6%
Space Separator
ValueCountFrequency (%)
203987
100.0%
Close Punctuation
ValueCountFrequency (%)
) 628
100.0%
Open Punctuation
ValueCountFrequency (%)
( 628
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 130
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4128133
95.2%
Common 207234
 
4.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 492047
11.9%
e 448033
10.9%
i 431908
10.5%
t 414009
 
10.0%
n 350125
 
8.5%
s 273998
 
6.6%
d 203110
 
4.9%
l 161645
 
3.9%
S 148976
 
3.6%
U 124934
 
3.0%
Other values (41) 1079348
26.1%
Common
ValueCountFrequency (%)
203987
98.4%
. 1384
 
0.7%
) 628
 
0.3%
( 628
 
0.3%
' 261
 
0.1%
, 204
 
0.1%
- 130
 
0.1%
/ 12
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4335367
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 492047
11.3%
e 448033
 
10.3%
i 431908
 
10.0%
t 414009
 
9.5%
n 350125
 
8.1%
s 273998
 
6.3%
203987
 
4.7%
d 203110
 
4.7%
l 161645
 
3.7%
S 148976
 
3.4%
Other values (49) 1207529
27.9%

stateProvince
Text

Missing 

Distinct: 1486
Distinct (%): 0.5%
Missing: 174317
Missing (%): 38.3%
Memory size: 3.5 MiB

Length

Max length: 48
Median length: 36
Mean length: 11.08349619
Min length: 3

Characters and Unicode

Total characters: 3113609
Distinct characters: 97
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 252
Unique (%): 0.1%

Sample

1st row: Hawaii
2nd row: Florida
3rd row: Tokyo Prefecture
4th row: West Virginia
5th row: Palawan
ValueCountFrequency (%)
province 30908
 
7.1%
florida 17201
 
4.0%
carolina 12506
 
2.9%
virginia 11495
 
2.7%
hawaii 10719
 
2.5%
north 9677
 
2.2%
region 9363
 
2.2%
south 8306
 
1.9%
maryland 7691
 
1.8%
islands 6712
 
1.6%
Other values (1479) 308437
71.2%

Most occurring characters

ValueCountFrequency (%)
a 411178
13.2%
i 266028
 
8.5%
n 234765
 
7.5%
o 224208
 
7.2%
e 216324
 
6.9%
r 215096
 
6.9%
152092
 
4.9%
s 136933
 
4.4%
t 132533
 
4.3%
l 123802
 
4.0%
Other values (87) 1000650
32.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2526911
81.2%
Uppercase Letter 429926
 
13.8%
Space Separator 152092
 
4.9%
Dash Punctuation 2749
 
0.1%
Other Punctuation 1809
 
0.1%
Open Punctuation 58
 
< 0.1%
Close Punctuation 58
 
< 0.1%
Decimal Number 6
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 411178
16.3%
i 266028
10.5%
n 234765
9.3%
o 224208
8.9%
e 216324
8.6%
r 215096
8.5%
s 136933
 
5.4%
t 132533
 
5.2%
l 123802
 
4.9%
u 87850
 
3.5%
Other values (44) 478194
18.9%
Uppercase Letter
ValueCountFrequency (%)
P 57746
13.4%
M 38632
 
9.0%
S 38478
 
8.9%
C 36519
 
8.5%
N 28116
 
6.5%
T 23306
 
5.4%
A 22915
 
5.3%
D 18318
 
4.3%
F 18140
 
4.2%
R 16359
 
3.8%
Other values (22) 131397
30.6%
Other Punctuation
ValueCountFrequency (%)
' 907
50.1%
. 898
49.6%
, 2
 
0.1%
/ 2
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 2719
98.9%
30
 
1.1%
Decimal Number
ValueCountFrequency (%)
0 3
50.0%
1 3
50.0%
Space Separator
ValueCountFrequency (%)
152092
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 58
100.0%
Close Punctuation
ValueCountFrequency (%)
] 58
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2956837
95.0%
Common 156772
 
5.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 411178
13.9%
i 266028
 
9.0%
n 234765
 
7.9%
o 224208
 
7.6%
e 216324
 
7.3%
r 215096
 
7.3%
s 136933
 
4.6%
t 132533
 
4.5%
l 123802
 
4.2%
u 87850
 
3.0%
Other values (76) 908120
30.7%
Common
ValueCountFrequency (%)
152092
97.0%
- 2719
 
1.7%
' 907
 
0.6%
. 898
 
0.6%
[ 58
 
< 0.1%
] 58
 
< 0.1%
30
 
< 0.1%
0 3
 
< 0.1%
1 3
 
< 0.1%
, 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3109091
99.9%
None 4465
 
0.1%
Punctuation 30
 
< 0.1%
Latin Ext Additional 23
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 411178
13.2%
i 266028
 
8.6%
n 234765
 
7.6%
o 224208
 
7.2%
e 216324
 
7.0%
r 215096
 
6.9%
152092
 
4.9%
s 136933
 
4.4%
t 132533
 
4.3%
l 123802
 
4.0%
Other values (52) 996132
32.0%
None
ValueCountFrequency (%)
á 1932
43.3%
ó 1585
35.5%
í 533
 
11.9%
é 135
 
3.0%
ã 109
 
2.4%
ê 36
 
0.8%
Á 27
 
0.6%
è 20
 
0.4%
å 11
 
0.2%
É 10
 
0.2%
Other values (19) 67
 
1.5%
Punctuation
ValueCountFrequency (%)
30
100.0%
Latin Ext Additional
ValueCountFrequency (%)
5
21.7%
ế 5
21.7%
5
21.7%
4
17.4%
4
17.4%

county
Text

Missing 

Distinct: 2317
Distinct (%): 2.4%
Missing: 357556
Missing (%): 78.5%
Memory size: 3.5 MiB

Length

Max length: 46
Median length: 40
Mean length: 14.85283158
Min length: 3

Characters and Unicode

Total characters: 1450884
Distinct characters: 87
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 417
Unique (%): 0.4%

Sample

1st row: Hillsborough County
2nd row: Randolph County
3rd row: Thoothukudi District
4th row: Calvert County
5th row: New Hanover County
ValueCountFrequency (%)
county 71648
34.9%
district 9128
 
4.4%
honolulu 5828
 
2.8%
monroe 3066
 
1.5%
parish 2197
 
1.1%
carteret 1943
 
0.9%
borough 1790
 
0.9%
san 1543
 
0.8%
montgomery 1350
 
0.7%
barnstable 1256
 
0.6%
Other values (2386) 105473
51.4%

Most occurring characters

ValueCountFrequency (%)
n 144011
 
9.9%
o 142674
 
9.8%
t 126041
 
8.7%
u 110455
 
7.6%
107538
 
7.4%
a 92090
 
6.3%
C 86849
 
6.0%
y 83562
 
5.8%
e 75888
 
5.2%
r 67047
 
4.6%
Other values (77) 414729
28.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1134750
78.2%
Uppercase Letter 205700
 
14.2%
Space Separator 107538
 
7.4%
Other Punctuation 2009
 
0.1%
Dash Punctuation 823
 
0.1%
Open Punctuation 22
 
< 0.1%
Close Punctuation 22
 
< 0.1%
Decimal Number 14
 
< 0.1%
Modifier Letter 6
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 144011
12.7%
o 142674
12.6%
t 126041
11.1%
u 110455
9.7%
a 92090
8.1%
y 83562
7.4%
e 75888
6.7%
r 67047
 
5.9%
i 59192
 
5.2%
l 50108
 
4.4%
Other values (37) 183682
16.2%
Uppercase Letter
ValueCountFrequency (%)
C 86849
42.2%
M 15548
 
7.6%
D 12613
 
6.1%
H 9959
 
4.8%
B 9681
 
4.7%
P 9352
 
4.5%
S 8796
 
4.3%
A 7470
 
3.6%
L 6042
 
2.9%
W 5571
 
2.7%
Other values (19) 33819
 
16.4%
Other Punctuation
ValueCountFrequency (%)
' 1298
64.6%
. 653
32.5%
, 58
 
2.9%
Dash Punctuation
ValueCountFrequency (%)
- 669
81.3%
154
 
18.7%
Decimal Number
ValueCountFrequency (%)
2 13
92.9%
4 1
 
7.1%
Space Separator
ValueCountFrequency (%)
107538
100.0%
Open Punctuation
ValueCountFrequency (%)
( 22
100.0%
Close Punctuation
ValueCountFrequency (%)
) 22
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1340450
92.4%
Common 110434
 
7.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 144011
10.7%
o 142674
10.6%
t 126041
 
9.4%
u 110455
 
8.2%
a 92090
 
6.9%
C 86849
 
6.5%
y 83562
 
6.2%
e 75888
 
5.7%
r 67047
 
5.0%
i 59192
 
4.4%
Other values (66) 352641
26.3%
Common
ValueCountFrequency (%)
107538
97.4%
' 1298
 
1.2%
- 669
 
0.6%
. 653
 
0.6%
154
 
0.1%
, 58
 
0.1%
( 22
 
< 0.1%
) 22
 
< 0.1%
2 13
 
< 0.1%
ʻ 6
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1448543
99.8%
None 2168
 
0.1%
Punctuation 154
 
< 0.1%
Latin Ext Additional 13
 
< 0.1%
Modifier Letters 6
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 144011
 
9.9%
o 142674
 
9.8%
t 126041
 
8.7%
u 110455
 
7.6%
107538
 
7.4%
a 92090
 
6.4%
C 86849
 
6.0%
y 83562
 
5.8%
e 75888
 
5.2%
r 67047
 
4.6%
Other values (51) 412388
28.5%
None
ValueCountFrequency (%)
ó 969
44.7%
á 512
23.6%
í 397
18.3%
é 78
 
3.6%
ú 66
 
3.0%
Ø 51
 
2.4%
ü 40
 
1.8%
ñ 21
 
1.0%
ō 15
 
0.7%
ū 6
 
0.3%
Other values (10) 13
 
0.6%
Punctuation
ValueCountFrequency (%)
154
100.0%
Latin Ext Additional
ValueCountFrequency (%)
10
76.9%
1
 
7.7%
1
 
7.7%
ế 1
 
7.7%
Modifier Letters
ValueCountFrequency (%)
ʻ 6
100.0%

locality
Text

Missing 

Distinct: 63950
Distinct (%): 15.6%
Missing: 45088
Missing (%): 9.9%
Memory size: 3.5 MiB

Length

Max length: 653
Median length: 273
Mean length: 54.14043574
Min length: 1

Characters and Unicode

Total characters: 22205808
Distinct characters: 113
Distinct categories: 17
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 31251
Unique (%): 7.6%

Sample

1st row: Hawaii
2nd row: Tampa, Florida
3rd row: Tokyo, Japan
4th row: West Virginia, Randolph County, Shaver's Fork at Cheat Bridge on US Route 250 (Durbin Quad)
5th row: No Data
ValueCountFrequency (%)
of 176514
 
5.1%
island 102331
 
3.0%
islands 48968
 
1.4%
bay 45485
 
1.3%
river 43650
 
1.3%
reef 43102
 
1.2%
off 42173
 
1.2%
and 41059
 
1.2%
at 38682
 
1.1%
south 38330
 
1.1%
Other values (37099) 2831165
82.0%

Most occurring characters

ValueCountFrequency (%)
3041308
 
13.7%
a 2148043
 
9.7%
e 1545845
 
7.0%
o 1444047
 
6.5%
n 1302137
 
5.9%
i 1144216
 
5.2%
t 1048275
 
4.7%
r 1048014
 
4.7%
s 988514
 
4.5%
l 838005
 
3.8%
Other values (103) 7657404
34.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15649899
70.5%
Space Separator 3041308
 
13.7%
Uppercase Letter 2205722
 
9.9%
Other Punctuation 922517
 
4.2%
Decimal Number 240602
 
1.1%
Open Punctuation 53108
 
0.2%
Close Punctuation 53073
 
0.2%
Dash Punctuation 38007
 
0.2%
Math Symbol 1469
 
< 0.1%
Other Symbol 56
 
< 0.1%
Other values (7) 47
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2148043
13.7%
e 1545845
9.9%
o 1444047
9.2%
n 1302137
 
8.3%
i 1144216
 
7.3%
t 1048275
 
6.7%
r 1048014
 
6.7%
s 988514
 
6.3%
l 838005
 
5.4%
u 586839
 
3.7%
Other values (29) 3555964
22.7%
Uppercase Letter
ValueCountFrequency (%)
S 210987
 
9.6%
C 210056
 
9.5%
I 192900
 
8.7%
B 180030
 
8.2%
P 172489
 
7.8%
M 151766
 
6.9%
R 133811
 
6.1%
N 108625
 
4.9%
A 99757
 
4.5%
T 89943
 
4.1%
Other values (18) 655358
29.7%
Other Punctuation
ValueCountFrequency (%)
, 682792
74.0%
. 154061
 
16.7%
; 33357
 
3.6%
: 23886
 
2.6%
' 15305
 
1.7%
/ 6812
 
0.7%
" 3670
 
0.4%
? 1335
 
0.1%
# 479
 
0.1%
* 470
 
0.1%
Other values (3) 350
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 46826
19.5%
0 42202
17.5%
2 35961
14.9%
5 25416
10.6%
3 22664
9.4%
4 19159
8.0%
7 13345
 
5.5%
6 13208
 
5.5%
8 12181
 
5.1%
9 9640
 
4.0%
Math Symbol
ValueCountFrequency (%)
= 1198
81.6%
~ 135
 
9.2%
+ 132
 
9.0%
> 4
 
0.3%
Open Punctuation
ValueCountFrequency (%)
( 48049
90.5%
[ 5050
 
9.5%
{ 9
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 48011
90.5%
] 5056
 
9.5%
} 6
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 38004
> 99.9%
3
 
< 0.1%
Final Punctuation
ValueCountFrequency (%)
9
81.8%
2
 
18.2%
Modifier Symbol
ValueCountFrequency (%)
´ 2
66.7%
^ 1
33.3%
Space Separator
ValueCountFrequency (%)
3041308
100.0%
Other Symbol
ValueCountFrequency (%)
° 56
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 25
100.0%
Control
ValueCountFrequency (%)
 4
100.0%
Initial Punctuation
ValueCountFrequency (%)
2
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 1
100.0%
Other Number
ValueCountFrequency (%)
½ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 17855621
80.4%
Common 4350187
 
19.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2148043
 
12.0%
e 1545845
 
8.7%
o 1444047
 
8.1%
n 1302137
 
7.3%
i 1144216
 
6.4%
t 1048275
 
5.9%
r 1048014
 
5.9%
s 988514
 
5.5%
l 838005
 
4.7%
u 586839
 
3.3%
Other values (57) 5761686
32.3%
Common
ValueCountFrequency (%)
3041308
69.9%
, 682792
 
15.7%
. 154061
 
3.5%
( 48049
 
1.1%
) 48011
 
1.1%
1 46826
 
1.1%
0 42202
 
1.0%
- 38004
 
0.9%
2 35961
 
0.8%
; 33357
 
0.8%
Other values (36) 179616
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 22205350
> 99.9%
None 441
 
< 0.1%
Punctuation 16
 
< 0.1%
Modifier Letters 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3041308
 
13.7%
a 2148043
 
9.7%
e 1545845
 
7.0%
o 1444047
 
6.5%
n 1302137
 
5.9%
i 1144216
 
5.2%
t 1048275
 
4.7%
r 1048014
 
4.7%
s 988514
 
4.5%
l 838005
 
3.8%
Other values (79) 7656946
34.5%
None
ValueCountFrequency (%)
á 148
33.6%
é 71
16.1%
ã 67
15.2%
° 56
 
12.7%
ø 37
 
8.4%
à 14
 
3.2%
ó 14
 
3.2%
í 10
 
2.3%
Ù 5
 
1.1%
 4
 
0.9%
Other values (9) 15
 
3.4%
Punctuation
ValueCountFrequency (%)
9
56.2%
3
 
18.8%
2
 
12.5%
2
 
12.5%
Modifier Letters
ValueCountFrequency (%)
ʻ 1
100.0%

verbatimElevation
Text

Missing 

Distinct: 76
Distinct (%): 3.4%
Missing: 453036
Missing (%): 99.5%
Memory size: 3.5 MiB

Length

Max length: 152
Median length: 68
Mean length: 46.38838475
Min length: 3

Characters and Unicode

Total characters: 102240
Distinct characters: 70
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): 0.6%

Sample

1st row: Rotenone put out at 90' and 120', pickup was surface to 140', several (fiscos=factors?) prevented an even better collection.
2nd row: Distance from shore: 1000 feet
3rd row: 32 not found in field notes so could be inaccurate.
4th row: Distance from shore: 1500 feet
5th row: Naso was speared by P.W. (Paul D. West)
ValueCountFrequency (%)
feet 1680
 
9.0%
distance 1141
 
6.1%
from 1097
 
5.9%
to 1064
 
5.7%
shore 1048
 
5.6%
at 595
 
3.2%
499
 
2.7%
and 445
 
2.4%
rotenone 430
 
2.3%
put 309
 
1.6%
Other values (175) 10428
55.7%

Most occurring characters

ValueCountFrequency (%)
16532
16.2%
e 10266
 
10.0%
t 7363
 
7.2%
o 6811
 
6.7%
a 5065
 
5.0%
s 4792
 
4.7%
f 4205
 
4.1%
n 4144
 
4.1%
r 4103
 
4.0%
0 3353
 
3.3%
Other values (60) 35606
34.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 67443
66.0%
Space Separator 16532
 
16.2%
Decimal Number 7713
 
7.5%
Uppercase Letter 5089
 
5.0%
Other Punctuation 3746
 
3.7%
Dash Punctuation 742
 
0.7%
Open Punctuation 462
 
0.5%
Close Punctuation 462
 
0.5%
Math Symbol 51
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 10266
15.2%
t 7363
10.9%
o 6811
10.1%
a 5065
 
7.5%
s 4792
 
7.1%
f 4205
 
6.2%
n 4144
 
6.1%
r 4103
 
6.1%
i 3239
 
4.8%
c 2531
 
3.8%
Other values (15) 14924
22.1%
Uppercase Letter
ValueCountFrequency (%)
D 1545
30.4%
T 692
13.6%
P 592
 
11.6%
W 574
 
11.3%
A 431
 
8.5%
R 430
 
8.4%
C 159
 
3.1%
N 135
 
2.7%
G 90
 
1.8%
V 77
 
1.5%
Other values (10) 364
 
7.2%
Decimal Number
ValueCountFrequency (%)
0 3353
43.5%
1 1286
 
16.7%
5 905
 
11.7%
2 800
 
10.4%
7 383
 
5.0%
6 218
 
2.8%
8 215
 
2.8%
3 210
 
2.7%
4 199
 
2.6%
9 144
 
1.9%
Other Punctuation
ValueCountFrequency (%)
. 1561
41.7%
: 1444
38.5%
, 292
 
7.8%
' 270
 
7.2%
" 97
 
2.6%
? 51
 
1.4%
; 30
 
0.8%
/ 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 318
68.8%
[ 144
31.2%
Close Punctuation
ValueCountFrequency (%)
) 318
68.8%
] 144
31.2%
Space Separator
ValueCountFrequency (%)
16532
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 742
100.0%
Math Symbol
ValueCountFrequency (%)
= 51
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 72532
70.9%
Common 29708
29.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 10266
14.2%
t 7363
 
10.2%
o 6811
 
9.4%
a 5065
 
7.0%
s 4792
 
6.6%
f 4205
 
5.8%
n 4144
 
5.7%
r 4103
 
5.7%
i 3239
 
4.5%
c 2531
 
3.5%
Other values (35) 20013
27.6%
Common
ValueCountFrequency (%)
16532
55.6%
0 3353
 
11.3%
. 1561
 
5.3%
: 1444
 
4.9%
1 1286
 
4.3%
5 905
 
3.0%
2 800
 
2.7%
- 742
 
2.5%
7 383
 
1.3%
( 318
 
1.1%
Other values (15) 2384
 
8.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 102240
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
16532
16.2%
e 10266
 
10.0%
t 7363
 
7.2%
o 6811
 
6.7%
a 5065
 
5.0%
s 4792
 
4.7%
f 4205
 
4.1%
n 4144
 
4.1%
r 4103
 
4.0%
0 3353
 
3.3%
Other values (60) 35606
34.8%

minimumDepthInMeters
Text

Missing 

Distinct: 1922
Distinct (%): 0.9%
Missing: 249167
Missing (%): 54.7%
Memory size: 3.5 MiB

Length

Max length: 23
Median length: 3
Mean length: 3.591271054
Min length: 3

Characters and Unicode

Total characters: 740064
Distinct characters: 27
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 370
Unique (%): 0.2%

Sample

1st row: 60.0
2nd row: 0.0
3rd row: 37.0
4th row: 7.0
5th row: 2.0
ValueCountFrequency (%)
0.0 75887
36.8%
2.0 10601
 
5.1%
1.0 8982
 
4.4%
3.0 6928
 
3.4%
5.0 4090
 
2.0%
4.0 3213
 
1.6%
9.0 2694
 
1.3%
6.0 2664
 
1.3%
18.0 2630
 
1.3%
15.0 2565
 
1.2%
Other values (1913) 85820
41.6%

Most occurring characters

ValueCountFrequency (%)
0 303880
41.1%
. 206071
27.8%
1 50031
 
6.8%
2 40859
 
5.5%
5 26866
 
3.6%
3 26806
 
3.6%
4 21203
 
2.9%
6 17768
 
2.4%
8 16376
 
2.2%
7 15498
 
2.1%
Other values (17) 14706
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 533960
72.2%
Other Punctuation 206071
 
27.8%
Lowercase Letter 30
 
< 0.1%
Uppercase Letter 2
 
< 0.1%
Space Separator 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 4
13.3%
r 3
10.0%
u 3
10.0%
n 3
10.0%
t 3
10.0%
p 2
6.7%
h 2
6.7%
y 2
6.7%
i 2
6.7%
o 2
6.7%
Other values (3) 4
13.3%
Decimal Number
ValueCountFrequency (%)
0 303880
56.9%
1 50031
 
9.4%
2 40859
 
7.7%
5 26866
 
5.0%
3 26806
 
5.0%
4 21203
 
4.0%
6 17768
 
3.3%
8 16376
 
3.1%
7 15498
 
2.9%
9 14673
 
2.7%
Uppercase Letter
ValueCountFrequency (%)
L 1
50.0%
O 1
50.0%
Other Punctuation
ValueCountFrequency (%)
. 206071
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 740032
> 99.9%
Latin 32
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 4
12.5%
r 3
9.4%
u 3
9.4%
n 3
9.4%
t 3
9.4%
p 2
 
6.2%
h 2
 
6.2%
y 2
 
6.2%
i 2
 
6.2%
o 2
 
6.2%
Other values (5) 6
18.8%
Common
ValueCountFrequency (%)
0 303880
41.1%
. 206071
27.8%
1 50031
 
6.8%
2 40859
 
5.5%
5 26866
 
3.6%
3 26806
 
3.6%
4 21203
 
2.9%
6 17768
 
2.4%
8 16376
 
2.2%
7 15498
 
2.1%
Other values (2) 14674
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 740064
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 303880
41.1%
. 206071
27.8%
1 50031
 
6.8%
2 40859
 
5.5%
5 26866
 
3.6%
3 26806
 
3.6%
4 21203
 
2.9%
6 17768
 
2.4%
8 16376
 
2.2%
7 15498
 
2.1%
Other values (17) 14706
 
2.0%

maximumDepthInMeters
Text

Missing 

Distinct 2019
Distinct (%) 1.1%
Missing 263906
Missing (%) 58.0%
Memory size 3.5 MiB

Length

Max length 7
Median length 6
Mean length 3.804828206
Min length 3

Characters and Unicode

Total characters 727993
Distinct characters 11
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 363
Unique (%) 0.2%

Sample

1st row 39.0
2nd row 4.6
3rd row 46.0
4th row 8.0
5th row 5.0
Value Count Frequency (%)
1.0 16618
 
8.7%
2.0 10759
 
5.6%
5.0 9941
 
5.2%
3.0 8373
 
4.4%
6.0 7357
 
3.8%
0.0 5339
 
2.8%
8.0 4959
 
2.6%
9.0 4731
 
2.5%
4.0 4712
 
2.5%
12.0 4114
 
2.2%
Other values (2009) 114431
59.8%

Most occurring characters

ValueCountFrequency (%)
0 227761
31.3%
. 191334
26.3%
1 75098
 
10.3%
2 48997
 
6.7%
5 41714
 
5.7%
3 35554
 
4.9%
4 24425
 
3.4%
6 23834
 
3.3%
8 22208
 
3.1%
7 19020
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 536659
73.7%
Other Punctuation 191334
 
26.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 227761
42.4%
1 75098
 
14.0%
2 48997
 
9.1%
5 41714
 
7.8%
3 35554
 
6.6%
4 24425
 
4.6%
6 23834
 
4.4%
8 22208
 
4.1%
7 19020
 
3.5%
9 18048
 
3.4%
Other Punctuation
ValueCountFrequency (%)
. 191334
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 727993
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 227761
31.3%
. 191334
26.3%
1 75098
 
10.3%
2 48997
 
6.7%
5 41714
 
5.7%
3 35554
 
4.9%
4 24425
 
3.4%
6 23834
 
3.3%
8 22208
 
3.1%
7 19020
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 727993
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 227761
31.3%
. 191334
26.3%
1 75098
 
10.3%
2 48997
 
6.7%
5 41714
 
5.7%
3 35554
 
4.9%
4 24425
 
3.4%
6 23834
 
3.3%
8 22208
 
3.1%
7 19020
 
2.6%

verbatimDepth
Text

Missing 

Distinct 231
Distinct (%) 2.7%
Missing 446663
Missing (%) 98.1%
Memory size 3.5 MiB

Length

Max length 38644
Median length 72
Mean length 12.75434301
Min length 1

Characters and Unicode

Total characters 109394
Distinct characters 83
Distinct categories 10
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 96
Unique (%) 1.1%

Sample

1st row Depth trawl: 135 fathoms
2nd row 15 minutes at depth
3rd row Surface
4th row CA
5th row 15 minutes at depth
Value Count Frequency (%)
ca 3933
21.2%
surface 2351
 
12.6%
depth 867
 
4.7%
at 574
 
3.1%
00000000 543
 
2.9%
to 508
 
2.7%
minutes 357
 
1.9%
fathoms 330
 
1.8%
m 326
 
1.8%
trawl 287
 
1.5%
Other values (1330) 8517
45.8%
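
Several verbatimDepth values, such as the sample row "Depth trawl: 135 fathoms", give a depth in fathoms rather than meters. The sketch below shows one possible way to pull such a figure out of free text and convert it. The function name fathoms_to_meters and the regular expression are assumptions for illustration and will not cover every phrasing in the column. The conversion factor is the international fathom (6 feet, 1.8288 m).

```python
import re

FATHOM_IN_METERS = 1.8288  # 6 feet at 0.3048 m per foot

# Matches a number followed by "fathom" or "fathoms" (assumed phrasing).
_FATHOMS = re.compile(r"(\d+(?:\.\d+)?)\s*fathoms?\b", re.IGNORECASE)


def fathoms_to_meters(text):
    """Return the first fathom figure in text converted to meters, or None."""
    match = _FATHOMS.search(text)
    if match is None:
        return None
    return float(match.group(1)) * FATHOM_IN_METERS


for row in ["Depth trawl: 135 fathoms", "Surface", "15 minutes at depth"]:
    print(repr(row), "->", fathoms_to_meters(row))
```

Rows without a recognizable fathom figure return None, so they can be reviewed by hand instead of being silently converted.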

Most occurring characters

ValueCountFrequency (%)
0 8646
 
7.9%
8475
 
7.7%
7888
 
7.2%
e 6992
 
6.4%
a 6272
 
5.7%
t 5576
 
5.1%
r 4792
 
4.4%
A 4421
 
4.0%
C 4147
 
3.8%
f 3689
 
3.4%
Other values (73) 48496
44.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 57894
52.9%
Decimal Number 16408
 
15.0%
Uppercase Letter 14037
 
12.8%
Space Separator 8475
 
7.7%
Control 7930
 
7.2%
Other Punctuation 3607
 
3.3%
Dash Punctuation 944
 
0.9%
Math Symbol 46
 
< 0.1%
Open Punctuation 38
 
< 0.1%
Close Punctuation 15
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 6992
12.1%
a 6272
10.8%
t 5576
 
9.6%
r 4792
 
8.3%
f 3689
 
6.4%
c 3660
 
6.3%
u 3656
 
6.3%
o 3587
 
6.2%
i 2862
 
4.9%
n 2526
 
4.4%
Other values (16) 14282
24.7%
Uppercase Letter
ValueCountFrequency (%)
A 4421
31.5%
C 4147
29.5%
S 2599
18.5%
N 350
 
2.5%
M 329
 
2.3%
O 311
 
2.2%
P 241
 
1.7%
H 216
 
1.5%
D 210
 
1.5%
T 206
 
1.5%
Other values (14) 1007
 
7.2%
Other Punctuation
ValueCountFrequency (%)
, 1073
29.7%
. 970
26.9%
: 623
17.3%
' 408
 
11.3%
/ 249
 
6.9%
; 156
 
4.3%
" 106
 
2.9%
? 10
 
0.3%
& 6
 
0.2%
# 4
 
0.1%
Other values (2) 2
 
0.1%
Decimal Number
ValueCountFrequency (%)
0 8646
52.7%
1 1539
 
9.4%
5 1278
 
7.8%
3 1030
 
6.3%
2 928
 
5.7%
9 767
 
4.7%
6 643
 
3.9%
8 567
 
3.5%
4 527
 
3.2%
7 483
 
2.9%
Math Symbol
ValueCountFrequency (%)
< 23
50.0%
= 22
47.8%
~ 1
 
2.2%
Control
ValueCountFrequency (%)
7888
99.5%
42
 
0.5%
Open Punctuation
ValueCountFrequency (%)
( 37
97.4%
[ 1
 
2.6%
Close Punctuation
ValueCountFrequency (%)
) 14
93.3%
] 1
 
6.7%
Space Separator
ValueCountFrequency (%)
8475
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 944
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 71931
65.8%
Common 37463
34.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 6992
 
9.7%
a 6272
 
8.7%
t 5576
 
7.8%
r 4792
 
6.7%
A 4421
 
6.1%
C 4147
 
5.8%
f 3689
 
5.1%
c 3660
 
5.1%
u 3656
 
5.1%
o 3587
 
5.0%
Other values (40) 25139
34.9%
Common
ValueCountFrequency (%)
0 8646
23.1%
8475
22.6%
7888
21.1%
1 1539
 
4.1%
5 1278
 
3.4%
, 1073
 
2.9%
3 1030
 
2.7%
. 970
 
2.6%
- 944
 
2.5%
2 928
 
2.5%
Other values (23) 4692
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 109394
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 8646
 
7.9%
8475
 
7.7%
7888
 
7.2%
e 6992
 
6.4%
a 6272
 
5.7%
t 5576
 
5.1%
r 4792
 
4.4%
A 4421
 
4.0%
C 4147
 
3.8%
f 3689
 
3.4%
Other values (73) 48496
44.3%

locationRemarks
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 455239
Missing (%) > 99.9%
Memory size 3.5 MiB

Length

Max length 20
Median length 20
Mean length 20
Min length 20

Characters and Unicode

Total characters 20
Distinct characters 15
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Williams, Jeffrey T.
Value Count Frequency (%)
williams 1
33.3%
jeffrey 1
33.3%
t 1
33.3%

Most occurring characters

ValueCountFrequency (%)
i 2
 
10.0%
l 2
 
10.0%
2
 
10.0%
e 2
 
10.0%
f 2
 
10.0%
W 1
 
5.0%
a 1
 
5.0%
m 1
 
5.0%
s 1
 
5.0%
, 1
 
5.0%
Other values (5) 5
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13
65.0%
Uppercase Letter 3
 
15.0%
Space Separator 2
 
10.0%
Other Punctuation 2
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 2
15.4%
l 2
15.4%
e 2
15.4%
f 2
15.4%
a 1
7.7%
m 1
7.7%
s 1
7.7%
r 1
7.7%
y 1
7.7%
Uppercase Letter
ValueCountFrequency (%)
W 1
33.3%
J 1
33.3%
T 1
33.3%
Other Punctuation
ValueCountFrequency (%)
, 1
50.0%
. 1
50.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16
80.0%
Common 4
 
20.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 2
12.5%
l 2
12.5%
e 2
12.5%
f 2
12.5%
W 1
6.2%
a 1
6.2%
m 1
6.2%
s 1
6.2%
J 1
6.2%
r 1
6.2%
Other values (2) 2
12.5%
Common
ValueCountFrequency (%)
2
50.0%
, 1
25.0%
. 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 2
 
10.0%
l 2
 
10.0%
2
 
10.0%
e 2
 
10.0%
f 2
 
10.0%
W 1
 
5.0%
a 1
 
5.0%
m 1
 
5.0%
s 1
 
5.0%
, 1
 
5.0%
Other values (5) 5
25.0%

decimalLatitude
Text

Missing 

Distinct 15628
Distinct (%) 7.8%
Missing 254361
Missing (%) 55.9%
Memory size 3.5 MiB

Length

Max length 134
Median length 131
Mean length 6.00106034
Min length 3

Characters and Unicode

Total characters 1205487
Distinct characters 39
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4362
Unique (%) 2.2%

Sample

1st row 13.2431
2nd row 10.9181
3rd row 31.93
4th row 10.72
5th row -2.0517
Value Count Frequency (%)
12.5 1211
 
0.6%
27.9 868
 
0.4%
16.8 711
 
0.4%
12.0832 620
 
0.3%
21.417 545
 
0.3%
19.1606 541
 
0.3%
32.23 510
 
0.3%
32.17 503
 
0.3%
32.3 491
 
0.2%
28.4933 489
 
0.2%
Other values (14225) 194409
96.8%

Most occurring characters

ValueCountFrequency (%)
. 200877
16.7%
3 140035
11.6%
1 139623
11.6%
2 127102
10.5%
8 92279
7.7%
7 91697
7.6%
5 84652
7.0%
4 71048
 
5.9%
6 69782
 
5.8%
9 69539
 
5.8%
Other values (29) 118853
9.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 946801
78.5%
Other Punctuation 200896
 
16.7%
Dash Punctuation 57544
 
4.8%
Lowercase Letter 206
 
< 0.1%
Uppercase Letter 21
 
< 0.1%
Space Separator 19
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 33
16.0%
e 26
12.6%
t 22
10.7%
a 18
8.7%
r 17
8.3%
o 16
 
7.8%
h 9
 
4.4%
c 9
 
4.4%
y 8
 
3.9%
n 8
 
3.9%
Other values (8) 40
19.4%
Decimal Number
ValueCountFrequency (%)
3 140035
14.8%
1 139623
14.7%
2 127102
13.4%
8 92279
9.7%
7 91697
9.7%
5 84652
8.9%
4 71048
7.5%
6 69782
7.4%
9 69539
7.3%
0 61044
6.4%
Uppercase Letter
ValueCountFrequency (%)
A 6
28.6%
P 3
14.3%
G 3
14.3%
O 3
14.3%
V 2
 
9.5%
N 2
 
9.5%
C 2
 
9.5%
Other Punctuation
ValueCountFrequency (%)
. 200877
> 99.9%
, 19
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 57544
100.0%
Space Separator
ValueCountFrequency (%)
19
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1205260
> 99.9%
Latin 227
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 33
14.5%
e 26
11.5%
t 22
 
9.7%
a 18
 
7.9%
r 17
 
7.5%
o 16
 
7.0%
h 9
 
4.0%
c 9
 
4.0%
y 8
 
3.5%
n 8
 
3.5%
Other values (15) 61
26.9%
Common
ValueCountFrequency (%)
. 200877
16.7%
3 140035
11.6%
1 139623
11.6%
2 127102
10.5%
8 92279
7.7%
7 91697
7.6%
5 84652
7.0%
4 71048
 
5.9%
6 69782
 
5.8%
9 69539
 
5.8%
Other values (4) 118626
9.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1205487
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 200877
16.7%
3 140035
11.6%
1 139623
11.6%
2 127102
10.5%
8 92279
7.7%
7 91697
7.6%
5 84652
7.0%
4 71048
 
5.9%
6 69782
 
5.8%
9 69539
 
5.8%
Other values (29) 118853
9.9%

decimalLongitude
Text

Missing 

Distinct 17138
Distinct (%) 8.5%
Missing 254361
Missing (%) 55.9%
Memory size 3.5 MiB

Length

Max length 8
Median length 7
Mean length 6.717337303
Min length 3

Characters and Unicode

Total characters 1349372
Distinct characters 18
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 5090
Unique (%) 2.5%

Sample

1st row -59.6561
2nd row 121.034
3rd row -63.95
4th row -67.88
5th row 130.107
Value Count Frequency (%)
177.083 872
 
0.4%
93.717 815
 
0.4%
88.08 737
 
0.4%
68.8991 618
 
0.3%
64.0 564
 
0.3%
158.417 546
 
0.3%
179.756 541
 
0.3%
162.875 490
 
0.2%
165.83 469
 
0.2%
84.9317 454
 
0.2%
Other values (16299) 194773
97.0%

Most occurring characters

ValueCountFrequency (%)
. 200877
14.9%
1 164623
12.2%
7 129011
9.6%
- 125602
9.3%
8 119389
8.8%
3 100047
7.4%
6 99833
7.4%
2 97899
7.3%
5 93339
6.9%
4 79131
 
5.9%
Other values (8) 139621
10.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1022877
75.8%
Other Punctuation 200877
 
14.9%
Dash Punctuation 125602
 
9.3%
Lowercase Letter 14
 
< 0.1%
Uppercase Letter 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 164623
16.1%
7 129011
12.6%
8 119389
11.7%
3 100047
9.8%
6 99833
9.8%
2 97899
9.6%
5 93339
9.1%
4 79131
7.7%
9 76337
7.5%
0 63268
 
6.2%
Lowercase Letter
ValueCountFrequency (%)
i 4
28.6%
a 4
28.6%
n 2
14.3%
m 2
14.3%
l 2
14.3%
Other Punctuation
ValueCountFrequency (%)
. 200877
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 125602
100.0%
Uppercase Letter
ValueCountFrequency (%)
A 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1349356
> 99.9%
Latin 16
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
. 200877
14.9%
1 164623
12.2%
7 129011
9.6%
- 125602
9.3%
8 119389
8.8%
3 100047
7.4%
6 99833
7.4%
2 97899
7.3%
5 93339
6.9%
4 79131
 
5.9%
Other values (2) 139605
10.3%
Latin
ValueCountFrequency (%)
i 4
25.0%
a 4
25.0%
A 2
12.5%
n 2
12.5%
m 2
12.5%
l 2
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1349372
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 200877
14.9%
1 164623
12.2%
7 129011
9.6%
- 125602
9.3%
8 119389
8.8%
3 100047
7.4%
6 99833
7.4%
2 97899
7.3%
5 93339
6.9%
4 79131
 
5.9%
Other values (8) 139621
10.3%

geodeticDatum
Text

Missing 

Distinct 6
Distinct (%) 0.1%
Missing 448060
Missing (%) 98.4%
Memory size 3.5 MiB

Length

Max length 19
Median length 18
Mean length 17.98649025
Min length 5

Characters and Unicode

Total characters 129143
Distinct characters 26
Distinct categories 7
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row WGS 84 (EPSG:4326)
2nd row WGS 84 (EPSG:4326)
3rd row WGS 84 (EPSG:4326)
4th row WGS 84 (EPSG:4326)
5th row WGS 84 (EPSG:4326)
Value Count Frequency (%)
wgs 7076
32.9%
84 7076
32.9%
epsg:4326 7027
32.7%
nad83 69
 
0.3%
epsg:4269 69
 
0.3%
epsg 49
 
0.2%
4326 49
 
0.2%
nad27 31
 
0.1%
epsg:4267 31
 
0.1%
wgs84 2
 
< 0.1%
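
The token counts above suggest that the same datum is written in more than one way (for example with and without an explicit EPSG code). A minimal sketch of normalizing such strings to an integer EPSG code follows. The function name epsg_code and the fallback table are assumptions for this example, and the table only covers the three datums that appear in the token list (WGS 84 = EPSG:4326, NAD83 = EPSG:4269, NAD27 = EPSG:4267).

```python
import re

# An explicit code such as "EPSG:4326" or "EPSG: 4326" takes priority.
_EPSG = re.compile(r"EPSG\s*:?\s*(\d{4,5})", re.IGNORECASE)

# Fallback for bare datum names (hypothetical lookup table, not exhaustive).
_NAME_TO_EPSG = {"WGS84": 4326, "NAD83": 4269, "NAD27": 4267}


def epsg_code(datum):
    """Return an EPSG code for a geodeticDatum string, or None if unrecognized."""
    match = _EPSG.search(datum)
    if match:
        return int(match.group(1))
    # Strip spaces and punctuation so that "WGS 84" and "WGS84" compare equal.
    key = re.sub(r"[^A-Za-z0-9]", "", datum).upper()
    return _NAME_TO_EPSG.get(key)


for value in ["WGS 84 (EPSG:4326)", "WGS84", "NAD27", "unrecognized text"]:
    print(repr(value), "->", epsg_code(value))
```

Values that do not resolve to a code return None and are left for manual review.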

Most occurring characters

ValueCountFrequency (%)
14301
11.1%
G 14254
11.0%
S 14254
11.0%
4 14254
11.0%
2 7207
 
5.6%
) 7176
 
5.6%
( 7176
 
5.6%
E 7176
 
5.6%
P 7176
 
5.6%
: 7176
 
5.6%
Other values (16) 28993
22.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 50240
38.9%
Decimal Number 43060
33.3%
Space Separator 14301
 
11.1%
Close Punctuation 7176
 
5.6%
Open Punctuation 7176
 
5.6%
Other Punctuation 7176
 
5.6%
Lowercase Letter 14
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
G 14254
28.4%
S 14254
28.4%
E 7176
14.3%
P 7176
14.3%
W 7078
14.1%
N 100
 
0.2%
A 100
 
0.2%
D 100
 
0.2%
C 2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
4 14254
33.1%
2 7207
16.7%
6 7176
16.7%
8 7147
16.6%
3 7145
16.6%
9 69
 
0.2%
7 62
 
0.1%
Lowercase Letter
ValueCountFrequency (%)
a 4
28.6%
h 2
14.3%
o 2
14.3%
r 2
14.3%
d 2
14.3%
t 2
14.3%
Space Separator
ValueCountFrequency (%)
14301
100.0%
Close Punctuation
ValueCountFrequency (%)
) 7176
100.0%
Open Punctuation
ValueCountFrequency (%)
( 7176
100.0%
Other Punctuation
ValueCountFrequency (%)
: 7176
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 78889
61.1%
Latin 50254
38.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
G 14254
28.4%
S 14254
28.4%
E 7176
14.3%
P 7176
14.3%
W 7078
14.1%
N 100
 
0.2%
A 100
 
0.2%
D 100
 
0.2%
a 4
 
< 0.1%
C 2
 
< 0.1%
Other values (5) 10
 
< 0.1%
Common
ValueCountFrequency (%)
14301
18.1%
4 14254
18.1%
2 7207
9.1%
) 7176
9.1%
( 7176
9.1%
: 7176
9.1%
6 7176
9.1%
8 7147
9.1%
3 7145
9.1%
9 69
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 129143
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
14301
11.1%
G 14254
11.0%
S 14254
11.0%
4 14254
11.0%
2 7207
 
5.6%
) 7176
 
5.6%
( 7176
 
5.6%
E 7176
 
5.6%
P 7176
 
5.6%
: 7176
 
5.6%
Other values (16) 28993
22.5%

coordinateUncertaintyInMeters
Text

Missing 

Distinct 221
Distinct (%) 4.3%
Missing 450085
Missing (%) 98.9%
Memory size 3.5 MiB

Length

Max length 14
Median length 7
Mean length 3.784869059
Min length 2

Characters and Unicode

Total characters 19511
Distinct characters 22
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 36
Unique (%) 0.7%

Sample

1st row 100
2nd row 457
3rd row 739
4th row 100
5th row 8438
Value Count Frequency (%)
100 1109
21.5%
10000 832
 
16.1%
3704 209
 
4.1%
500 188
 
3.6%
5000 122
 
2.4%
278076 107
 
2.1%
441 89
 
1.7%
330 83
 
1.6%
50 78
 
1.5%
3512 73
 
1.4%
Other values (211) 2265
43.9%

Most occurring characters

ValueCountFrequency (%)
0 8004
41.0%
1 3231
16.6%
2 1521
 
7.8%
4 1387
 
7.1%
3 1195
 
6.1%
5 1165
 
6.0%
6 828
 
4.2%
7 776
 
4.0%
8 720
 
3.7%
9 629
 
3.2%
Other values (12) 55
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 19456
99.7%
Other Punctuation 27
 
0.1%
Lowercase Letter 26
 
0.1%
Uppercase Letter 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 8004
41.1%
1 3231
16.6%
2 1521
 
7.8%
4 1387
 
7.1%
3 1195
 
6.1%
5 1165
 
6.0%
6 828
 
4.3%
7 776
 
4.0%
8 720
 
3.7%
9 629
 
3.2%
Lowercase Letter
ValueCountFrequency (%)
i 6
23.1%
t 4
15.4%
p 2
 
7.7%
y 2
 
7.7%
r 2
 
7.7%
e 2
 
7.7%
o 2
 
7.7%
n 2
 
7.7%
c 2
 
7.7%
g 2
 
7.7%
Other Punctuation
ValueCountFrequency (%)
. 27
100.0%
Uppercase Letter
ValueCountFrequency (%)
A 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 19483
99.9%
Latin 28
 
0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 8004
41.1%
1 3231
16.6%
2 1521
 
7.8%
4 1387
 
7.1%
3 1195
 
6.1%
5 1165
 
6.0%
6 828
 
4.2%
7 776
 
4.0%
8 720
 
3.7%
9 629
 
3.2%
Latin
ValueCountFrequency (%)
i 6
21.4%
t 4
14.3%
p 2
 
7.1%
y 2
 
7.1%
r 2
 
7.1%
e 2
 
7.1%
A 2
 
7.1%
o 2
 
7.1%
n 2
 
7.1%
c 2
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19511
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 8004
41.0%
1 3231
16.6%
2 1521
 
7.8%
4 1387
 
7.1%
3 1195
 
6.1%
5 1165
 
6.0%
6 828
 
4.2%
7 776
 
4.0%
8 720
 
3.7%
9 629
 
3.2%
Other values (12) 55
 
0.3%

coordinatePrecision
Text

Constant  Missing 

Distinct 1
Distinct (%) 50.0%
Missing 455238
Missing (%) > 99.9%
Memory size 3.5 MiB

Length

Max length 11
Median length 11
Mean length 11
Min length 11

Characters and Unicode

Total characters 22
Distinct characters 9
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Perciformes
2nd row Perciformes
Value Count Frequency (%)
perciformes 2
100.0%

Most occurring characters

ValueCountFrequency (%)
e 4
18.2%
r 4
18.2%
P 2
9.1%
c 2
9.1%
i 2
9.1%
f 2
9.1%
o 2
9.1%
m 2
9.1%
s 2
9.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 20
90.9%
Uppercase Letter 2
 
9.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 4
20.0%
r 4
20.0%
c 2
10.0%
i 2
10.0%
f 2
10.0%
o 2
10.0%
m 2
10.0%
s 2
10.0%
Uppercase Letter
ValueCountFrequency (%)
P 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 22
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 4
18.2%
r 4
18.2%
P 2
9.1%
c 2
9.1%
i 2
9.1%
f 2
9.1%
o 2
9.1%
m 2
9.1%
s 2
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 22
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 4
18.2%
r 4
18.2%
P 2
9.1%
c 2
9.1%
i 2
9.1%
f 2
9.1%
o 2
9.1%
m 2
9.1%
s 2
9.1%

verbatimCoordinates
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 455238
Missing (%) > 99.9%
Memory size 3.5 MiB

Length

Max length 15
Median length 11.5
Mean length 11.5
Min length 8

Characters and Unicode

Total characters 23
Distinct characters 14
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row Opistognathidae
2nd row Gobiidae
Value Count Frequency (%)
opistognathidae 1
50.0%
gobiidae 1
50.0%

Most occurring characters

ValueCountFrequency (%)
i 4
17.4%
a 3
13.0%
t 2
8.7%
o 2
8.7%
d 2
8.7%
e 2
8.7%
O 1
 
4.3%
p 1
 
4.3%
s 1
 
4.3%
g 1
 
4.3%
Other values (4) 4
17.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 21
91.3%
Uppercase Letter 2
 
8.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 4
19.0%
a 3
14.3%
t 2
9.5%
o 2
9.5%
d 2
9.5%
e 2
9.5%
p 1
 
4.8%
s 1
 
4.8%
g 1
 
4.8%
n 1
 
4.8%
Other values (2) 2
9.5%
Uppercase Letter
ValueCountFrequency (%)
O 1
50.0%
G 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 23
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 4
17.4%
a 3
13.0%
t 2
8.7%
o 2
8.7%
d 2
8.7%
e 2
8.7%
O 1
 
4.3%
p 1
 
4.3%
s 1
 
4.3%
g 1
 
4.3%
Other values (4) 4
17.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 4
17.4%
a 3
13.0%
t 2
8.7%
o 2
8.7%
d 2
8.7%
e 2
8.7%
O 1
 
4.3%
p 1
 
4.3%
s 1
 
4.3%
g 1
 
4.3%
Other values (4) 4
17.4%

verbatimLatitude
Text

Missing 

Distinct 14112
Distinct (%) 7.8%
Missing 275360
Missing (%) 60.5%
Memory size 3.5 MiB

Length

Max length 46
Median length 39
Mean length 7.521342006
Min length 2

Characters and Unicode

Total characters 1352939
Distinct characters 37
Distinct categories 8
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3903
Unique (%) 2.2%

Sample

1st row 13.243 N
2nd row 105505N
3rd row 3156 N
4th row 1043 N
5th row 020306S
Value Count Frequency (%)
n 84855
 
24.0%
s 26346
 
7.5%
00 4829
 
1.4%
28 4344
 
1.2%
30 4225
 
1.2%
21 2906
 
0.8%
27 2437
 
0.7%
13 2217
 
0.6%
12 2197
 
0.6%
25 2007
 
0.6%
Other values (10716) 217209
61.4%

Most occurring characters

ValueCountFrequency (%)
0 175112
12.9%
173692
12.8%
1 149455
11.0%
2 131496
9.7%
3 130094
9.6%
N 123256
9.1%
4 90672
6.7%
5 88496
6.5%
8 56508
 
4.2%
9 51810
 
3.8%
Other values (27) 182348
13.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 965626
71.4%
Space Separator 173692
 
12.8%
Uppercase Letter 173347
 
12.8%
Other Punctuation 38615
 
2.9%
Dash Punctuation 840
 
0.1%
Other Symbol 782
 
0.1%
Lowercase Letter 36
 
< 0.1%
Final Punctuation 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 175112
18.1%
1 149455
15.5%
2 131496
13.6%
3 130094
13.5%
4 90672
9.4%
5 88496
9.2%
8 56508
 
5.9%
9 51810
 
5.4%
7 46599
 
4.8%
6 45384
 
4.7%
Lowercase Letter
ValueCountFrequency (%)
d 9
25.0%
e 9
25.0%
g 9
25.0%
a 2
 
5.6%
w 2
 
5.6%
c 1
 
2.8%
l 1
 
2.8%
o 1
 
2.8%
r 1
 
2.8%
n 1
 
2.8%
Other Punctuation
ValueCountFrequency (%)
. 31909
82.6%
; 4950
 
12.8%
' 1231
 
3.2%
" 456
 
1.2%
? 54
 
0.1%
: 7
 
< 0.1%
, 6
 
< 0.1%
1
 
< 0.1%
1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
N 123256
71.1%
S 50071
28.9%
W 11
 
< 0.1%
E 9
 
< 0.1%
Space Separator
ValueCountFrequency (%)
173692
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 840
100.0%
Other Symbol
ValueCountFrequency (%)
° 782
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1179556
87.2%
Latin 173383
 
12.8%

Most frequent character per script

Common
ValueCountFrequency (%)
0 175112
14.8%
173692
14.7%
1 149455
12.7%
2 131496
11.1%
3 130094
11.0%
4 90672
7.7%
5 88496
7.5%
8 56508
 
4.8%
9 51810
 
4.4%
7 46599
 
4.0%
Other values (13) 85622
7.3%
Latin
ValueCountFrequency (%)
N 123256
71.1%
S 50071
28.9%
W 11
 
< 0.1%
d 9
 
< 0.1%
e 9
 
< 0.1%
g 9
 
< 0.1%
E 9
 
< 0.1%
a 2
 
< 0.1%
w 2
 
< 0.1%
c 1
 
< 0.1%
Other values (4) 4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1352154
99.9%
None 782
 
0.1%
Punctuation 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 175112
13.0%
173692
12.8%
1 149455
11.1%
2 131496
9.7%
3 130094
9.6%
N 123256
9.1%
4 90672
6.7%
5 88496
6.5%
8 56508
 
4.2%
9 51810
 
3.8%
Other values (23) 181563
13.4%
None
ValueCountFrequency (%)
° 782
100.0%
Punctuation
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

verbatimLongitude
Text

Missing 

Distinct 15210
Distinct (%) 8.5%
Missing 275421
Missing (%) 60.5%
Memory size 3.5 MiB

Length

Max length 49
Median length 46
Mean length 8.465312342
Min length 2

Characters and Unicode

Total characters 1522224
Distinct characters 35
Distinct categories 9
Distinct scripts 2
Distinct blocks 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4426
Unique (%) 2.5%

Sample

1st row -59.656 W
2nd row 1210203E
3rd row 06357 W
4th row 06753 W
5th row 1300624E
Value Count Frequency (%)
w 80109
 
22.7%
e 30683
 
8.7%
00 4553
 
1.3%
30 3562
 
1.0%
88 2942
 
0.8%
85 1702
 
0.5%
120 1682
 
0.5%
1400230e 1497
 
0.4%
121 1388
 
0.4%
77 1340
 
0.4%
Other values (12425) 223823
63.4%
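
The verbatimLatitude and verbatimLongitude sample rows mix two notations: decimal degrees with a hemisphere letter ("13.243 N") and digit-packed degrees, minutes and optional seconds ("105505N", "06357 W"). The sketch below converts both to signed decimal degrees. The reading of the packed form (two degree digits for latitude, three for longitude, then two-digit minutes and optional two-digit seconds) is an assumption inferred from how the sample rows line up with the decimalLatitude and decimalLongitude samples. The function name verbatim_to_decimal is hypothetical, and other notations in the column (degree signs, quotes, question marks) are not handled.

```python
import re

# A number (optionally signed, optionally decimal) followed by a hemisphere letter.
_PATTERN = re.compile(r"^\s*([+-]?\d+(?:\.\d+)?)\s*([NSEW])\s*$", re.IGNORECASE)


def verbatim_to_decimal(text):
    """Convert e.g. '105505N' or '-59.656 W' to signed decimal degrees, else None."""
    match = _PATTERN.match(text)
    if match is None:
        return None
    number, hemi = match.group(1), match.group(2).upper()
    sign = -1.0 if hemi in "SW" else 1.0
    if "." in number:
        # Already decimal degrees; let the hemisphere letter decide the sign.
        return sign * abs(float(number))
    digits = number.lstrip("+-")
    deg_width = 3 if hemi in "EW" else 2  # assumed: DDD for longitude, DD for latitude
    if len(digits) not in (deg_width + 2, deg_width + 4):
        return None  # not DDMM / DDMMSS (or DDDMM / DDDMMSS)
    degrees = int(digits[:deg_width])
    minutes = int(digits[deg_width:deg_width + 2])
    seconds = int(digits[deg_width + 2:] or 0)
    return sign * (degrees + minutes / 60 + seconds / 3600)


for row in ["105505N", "3156 N", "020306S", "1210203E", "06357 W", "-59.656 W"]:
    print(repr(row), "->", verbatim_to_decimal(row))
```

Under this reading, "105505N" comes out near 10.9181 and "1210203E" near 121.034, which agrees with the second sample rows of decimalLatitude and decimalLongitude.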

Most occurring characters

ValueCountFrequency (%)
0 224236
14.7%
173462
11.4%
1 166429
10.9%
5 116123
7.6%
2 108641
7.1%
W 107056
7.0%
4 102067
6.7%
3 99138
 
6.5%
7 86035
 
5.7%
6 84236
 
5.5%
Other values (25) 254801
16.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1127846
74.1%
Space Separator 173462
 
11.4%
Uppercase Letter 173257
 
11.4%
Other Punctuation 38483
 
2.5%
Dash Punctuation 8361
 
0.5%
Other Symbol 782
 
0.1%
Lowercase Letter 31
 
< 0.1%
Final Punctuation 1
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 224236
19.9%
1 166429
14.8%
5 116123
10.3%
2 108641
9.6%
4 102067
9.0%
3 99138
8.8%
7 86035
 
7.6%
6 84236
 
7.5%
8 82712
 
7.3%
9 58229
 
5.2%
Other Punctuation
ValueCountFrequency (%)
. 31783
82.6%
; 4948
 
12.9%
' 1222
 
3.2%
" 461
 
1.2%
? 54
 
0.1%
: 7
 
< 0.1%
, 6
 
< 0.1%
1
 
< 0.1%
1
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
d 9
29.0%
e 9
29.0%
g 9
29.0%
c 1
 
3.2%
o 1
 
3.2%
t 1
 
3.2%
a 1
 
3.2%
Uppercase Letter
ValueCountFrequency (%)
W 107056
61.8%
E 66153
38.2%
N 43
 
< 0.1%
S 5
 
< 0.1%
Space Separator
ValueCountFrequency (%)
173462
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8361
100.0%
Other Symbol
ValueCountFrequency (%)
° 782
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%
Math Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1348936
88.6%
Latin 173288
 
11.4%

Most frequent character per script

Common
ValueCountFrequency (%)
0 224236
16.6%
173462
12.9%
1 166429
12.3%
5 116123
8.6%
2 108641
8.1%
4 102067
7.6%
3 99138
7.3%
7 86035
 
6.4%
6 84236
 
6.2%
8 82712
 
6.1%
Other values (14) 105857
7.8%
Latin
ValueCountFrequency (%)
W 107056
61.8%
E 66153
38.2%
N 43
 
< 0.1%
d 9
 
< 0.1%
e 9
 
< 0.1%
g 9
 
< 0.1%
S 5
 
< 0.1%
c 1
 
< 0.1%
o 1
 
< 0.1%
t 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1521438
99.9%
None 782
 
0.1%
Punctuation 3
 
< 0.1%
Math Operators 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 224236
14.7%
173462
11.4%
1 166429
10.9%
5 116123
7.6%
2 108641
7.1%
W 107056
7.0%
4 102067
6.7%
3 99138
 
6.5%
7 86035
 
5.7%
6 84236
 
5.5%
Other values (20) 254015
16.7%
None
ValueCountFrequency (%)
° 782
100.0%
Punctuation
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
Math Operators
ValueCountFrequency (%)
1
100.0%

verbatimCoordinateSystem
Text

Missing 

Distinct 3
Distinct (%) < 0.1%
Missing 308955
Missing (%) 67.9%
Memory size 3.5 MiB

Length

Max length 23
Median length 23
Mean length 22.9275934
Min length 7

Characters and Unicode

Total characters 3353963
Distinct characters 21
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Degrees Minutes Seconds
2nd row Degrees Minutes Seconds
3rd row Degrees Minutes Seconds
4th row Degrees Minutes Seconds
5th row Degrees Minutes Seconds
Value Count Frequency (%)
degrees 146277
33.4%
minutes 144969
33.1%
seconds 144969
33.1%
decimal 1308
 
0.3%
unknown 8
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 730077
21.8%
s 436215
13.0%
291246
 
8.7%
n 289962
 
8.6%
D 146277
 
4.4%
c 146277
 
4.4%
g 146277
 
4.4%
r 146277
 
4.4%
d 146277
 
4.4%
i 146277
 
4.4%
Other values (11) 728801
21.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2626494
78.3%
Uppercase Letter 436223
 
13.0%
Space Separator 291246
 
8.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 730077
27.8%
s 436215
16.6%
n 289962
 
11.0%
c 146277
 
5.6%
g 146277
 
5.6%
r 146277
 
5.6%
d 146277
 
5.6%
i 146277
 
5.6%
o 144977
 
5.5%
t 144969
 
5.5%
Other values (6) 148909
 
5.7%
Uppercase Letter
ValueCountFrequency (%)
D 146277
33.5%
S 144969
33.2%
M 144969
33.2%
U 8
 
< 0.1%
Space Separator
ValueCountFrequency (%)
291246
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3062717
91.3%
Common 291246
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 730077
23.8%
s 436215
14.2%
n 289962
 
9.5%
D 146277
 
4.8%
c 146277
 
4.8%
g 146277
 
4.8%
r 146277
 
4.8%
d 146277
 
4.8%
i 146277
 
4.8%
o 144977
 
4.7%
Other values (10) 583824
19.1%
Common
ValueCountFrequency (%)
291246
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3353963
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 730077
21.8%
s 436215
13.0%
291246
 
8.7%
n 289962
 
8.6%
D 146277
 
4.4%
c 146277
 
4.4%
g 146277
 
4.4%
r 146277
 
4.4%
d 146277
 
4.4%
i 146277
 
4.4%
Other values (11) 728801
21.7%

verbatimSRS
Text

Missing 

Distinct3
Distinct (%)100.0%
Missing455237
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length13
Median length10
Mean length11
Min length10
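
The length figures above can be checked by hand. The following is a minimal sketch, assuming the column's non-missing values are exactly the three sample rows listed for this variable; `length_summary` is a hypothetical helper name, not part of the profiling tool.

```python
from statistics import mean, median

def length_summary(values):
    """Return max, median, mean and min string length for the given values."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }

# The three non-missing values shown in the Sample block for this variable.
summary = length_summary(["1975-09-28", "Opistognathus", "Lythrypnus"])
print(summary)
```

The three lengths are 10, 13 and 10, which agrees with the reported total of 33 characters.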

Characters and Unicode

Total characters33
Distinct characters22
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3 ?
Unique (%)100.0%

Sample

1st row1975-09-28
2nd rowOpistognathus
3rd rowLythrypnus
ValueCountFrequency (%)
1975-09-28 1
33.3%
opistognathus 1
33.3%
lythrypnus 1
33.3%

Most occurring characters

ValueCountFrequency (%)
s 3
 
9.1%
t 3
 
9.1%
y 2
 
6.1%
- 2
 
6.1%
u 2
 
6.1%
h 2
 
6.1%
n 2
 
6.1%
p 2
 
6.1%
9 2
 
6.1%
o 1
 
3.0%
Other values (12) 12
36.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 21
63.6%
Decimal Number 8
 
24.2%
Dash Punctuation 2
 
6.1%
Uppercase Letter 2
 
6.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 3
14.3%
t 3
14.3%
y 2
9.5%
u 2
9.5%
h 2
9.5%
n 2
9.5%
p 2
9.5%
o 1
 
4.8%
a 1
 
4.8%
g 1
 
4.8%
Other values (2) 2
9.5%
Decimal Number
ValueCountFrequency (%)
9 2
25.0%
1 1
12.5%
8 1
12.5%
2 1
12.5%
0 1
12.5%
5 1
12.5%
7 1
12.5%
Uppercase Letter
ValueCountFrequency (%)
L 1
50.0%
O 1
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 23
69.7%
Common 10
30.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 3
13.0%
t 3
13.0%
y 2
8.7%
u 2
8.7%
h 2
8.7%
n 2
8.7%
p 2
8.7%
o 1
 
4.3%
L 1
 
4.3%
a 1
 
4.3%
Other values (4) 4
17.4%
Common
ValueCountFrequency (%)
- 2
20.0%
9 2
20.0%
1 1
10.0%
8 1
10.0%
2 1
10.0%
0 1
10.0%
5 1
10.0%
7 1
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 33
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 3
 
9.1%
t 3
 
9.1%
y 2
 
6.1%
- 2
 
6.1%
u 2
 
6.1%
h 2
 
6.1%
n 2
 
6.1%
p 2
 
6.1%
9 2
 
6.1%
o 1
 
3.0%
Other values (12) 12
36.4%

footprintSRS
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing455239
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters3
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st row271
ValueCountFrequency (%)
271 1
100.0%

Most occurring characters

ValueCountFrequency (%)
2 1
33.3%
7 1
33.3%
1 1
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 1
33.3%
7 1
33.3%
1 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 3
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 1
33.3%
7 1
33.3%
1 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 1
33.3%
7 1
33.3%
1 1
33.3%

footprintSpatialFit
Text

Missing 

Distinct8
Distinct (%)100.0%
Missing455232
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length24
Median length17.5
Mean length16.375
Min length3

Characters and Unicode

Total characters131
Distinct characters30
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique8 ?
Unique (%)100.0%

Sample

1st rowNoturus nocturnus
2nd row271
3rd rowThalassoma lunare
4th rowBrycon falcatus
5th rowPseudotropheus elongatus
ValueCountFrequency (%)
noturus 1
 
6.7%
nocturnus 1
 
6.7%
271 1
 
6.7%
thalassoma 1
 
6.7%
lunare 1
 
6.7%
brycon 1
 
6.7%
falcatus 1
 
6.7%
pseudotropheus 1
 
6.7%
elongatus 1
 
6.7%
halieutaea 1
 
6.7%
Other values (5) 5
33.3%

Most occurring characters

ValueCountFrequency (%)
s 14
 
10.7%
a 14
 
10.7%
u 13
 
9.9%
r 9
 
6.9%
e 9
 
6.9%
o 9
 
6.9%
l 7
 
5.3%
7
 
5.3%
c 7
 
5.3%
t 6
 
4.6%
Other values (20) 36
27.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 114
87.0%
Space Separator 7
 
5.3%
Uppercase Letter 7
 
5.3%
Decimal Number 3
 
2.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 14
12.3%
a 14
12.3%
u 13
11.4%
r 9
7.9%
e 9
7.9%
o 9
7.9%
l 7
 
6.1%
c 7
 
6.1%
t 6
 
5.3%
n 5
 
4.4%
Other values (10) 21
18.4%
Uppercase Letter
ValueCountFrequency (%)
B 2
28.6%
H 1
14.3%
P 1
14.3%
N 1
14.3%
T 1
14.3%
S 1
14.3%
Decimal Number
ValueCountFrequency (%)
1 1
33.3%
7 1
33.3%
2 1
33.3%
Space Separator
ValueCountFrequency (%)
7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 121
92.4%
Common 10
 
7.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 14
11.6%
a 14
11.6%
u 13
10.7%
r 9
 
7.4%
e 9
 
7.4%
o 9
 
7.4%
l 7
 
5.8%
c 7
 
5.8%
t 6
 
5.0%
n 5
 
4.1%
Other values (16) 28
23.1%
Common
ValueCountFrequency (%)
7
70.0%
1 1
 
10.0%
7 1
 
10.0%
2 1
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 131
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 14
 
10.7%
a 14
 
10.7%
u 13
 
9.9%
r 9
 
6.9%
e 9
 
6.9%
o 9
 
6.9%
l 7
 
5.3%
7
 
5.3%
c 7
 
5.3%
t 6
 
4.6%
Other values (20) 36
27.5%

georeferencedBy
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing455238
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length9
Median length6.5
Mean length6.5
Min length4

Characters and Unicode

Total characters13
Distinct characters12
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)100.0%

Sample

1st row1975
2nd rowaurifrons
ValueCountFrequency (%)
1975 1
50.0%
aurifrons 1
50.0%

Most occurring characters

ValueCountFrequency (%)
r 2
15.4%
1 1
7.7%
9 1
7.7%
7 1
7.7%
5 1
7.7%
a 1
7.7%
u 1
7.7%
i 1
7.7%
f 1
7.7%
o 1
7.7%
Other values (2) 2
15.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9
69.2%
Decimal Number 4
30.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 2
22.2%
a 1
11.1%
u 1
11.1%
i 1
11.1%
f 1
11.1%
o 1
11.1%
n 1
11.1%
s 1
11.1%
Decimal Number
ValueCountFrequency (%)
1 1
25.0%
9 1
25.0%
7 1
25.0%
5 1
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9
69.2%
Common 4
30.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 2
22.2%
a 1
11.1%
u 1
11.1%
i 1
11.1%
f 1
11.1%
o 1
11.1%
n 1
11.1%
s 1
11.1%
Common
ValueCountFrequency (%)
1 1
25.0%
9 1
25.0%
7 1
25.0%
5 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 2
15.4%
1 1
7.7%
9 1
7.7%
7 1
7.7%
5 1
7.7%
a 1
7.7%
u 1
7.7%
i 1
7.7%
f 1
7.7%
o 1
7.7%
Other values (2) 2
15.4%

georeferencedDate
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing455239
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1
Distinct characters1
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st row9
ValueCountFrequency (%)
9 1
100.0%

Most occurring characters

ValueCountFrequency (%)
9 1
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 1
100.0%

georeferenceProtocol
Text

Missing 

Distinct17
Distinct (%)0.1%
Missing437858
Missing (%)96.2%
Memory size3.5 MiB

Length

Max length125
Median length96
Mean length19.25670234
Min length2

Characters and Unicode

Total characters334720
Distinct characters62
Distinct categories9 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowGPS
2nd rowOn-line Gazetteer
3rd rowDifferential GPS
4th rowGuide to Best Practices for Georeferencing. (Chapman and Wieczorek, eds. 2006). Google Earth Pro
5th rowChart
ValueCountFrequency (%)
chart 6339
 
11.9%
gps 6319
 
11.9%
google 3627
 
6.8%
earth 3256
 
6.1%
georeferencing 2448
 
4.6%
and 2426
 
4.6%
wieczorek 2399
 
4.5%
pro 2399
 
4.5%
to 2399
 
4.5%
eds 2399
 
4.5%
Other values (38) 19230
36.1%

Most occurring characters

ValueCountFrequency (%)
35859
 
10.7%
e 34001
 
10.2%
r 26699
 
8.0%
a 21993
 
6.6%
t 20226
 
6.0%
o 19852
 
5.9%
G 16343
 
4.9%
n 13437
 
4.0%
h 12548
 
3.7%
i 12159
 
3.6%
Other values (52) 121603
36.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 217719
65.0%
Uppercase Letter 53907
 
16.1%
Space Separator 35859
 
10.7%
Other Punctuation 10427
 
3.1%
Decimal Number 10341
 
3.1%
Close Punctuation 2475
 
0.7%
Open Punctuation 2475
 
0.7%
Dash Punctuation 1235
 
0.4%
Math Symbol 282
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 34001
15.6%
r 26699
12.3%
a 21993
10.1%
t 20226
9.3%
o 19852
9.1%
n 13437
 
6.2%
h 12548
 
5.8%
i 12159
 
5.6%
c 10049
 
4.6%
s 7688
 
3.5%
Other values (15) 39067
17.9%
Uppercase Letter
ValueCountFrequency (%)
G 16343
30.3%
P 11269
20.9%
C 8787
16.3%
S 6379
 
11.8%
E 3538
 
6.6%
W 2399
 
4.5%
B 2399
 
4.5%
O 1228
 
2.3%
M 344
 
0.6%
R 331
 
0.6%
Other values (9) 890
 
1.7%
Decimal Number
ValueCountFrequency (%)
0 5063
49.0%
2 2574
24.9%
6 2399
23.2%
1 179
 
1.7%
3 49
 
0.5%
5 49
 
0.5%
4 27
 
0.3%
8 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
. 7420
71.2%
, 2502
 
24.0%
; 206
 
2.0%
/ 201
 
1.9%
: 98
 
0.9%
Space Separator
ValueCountFrequency (%)
35859
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2475
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2475
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1235
100.0%
Math Symbol
ValueCountFrequency (%)
+ 282
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 271626
81.2%
Common 63094
 
18.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 34001
12.5%
r 26699
 
9.8%
a 21993
 
8.1%
t 20226
 
7.4%
o 19852
 
7.3%
G 16343
 
6.0%
n 13437
 
4.9%
h 12548
 
4.6%
i 12159
 
4.5%
P 11269
 
4.1%
Other values (34) 83099
30.6%
Common
ValueCountFrequency (%)
35859
56.8%
. 7420
 
11.8%
0 5063
 
8.0%
2 2574
 
4.1%
, 2502
 
4.0%
) 2475
 
3.9%
( 2475
 
3.9%
6 2399
 
3.8%
- 1235
 
2.0%
+ 282
 
0.4%
Other values (8) 810
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 334720
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
35859
 
10.7%
e 34001
 
10.2%
r 26699
 
8.0%
a 21993
 
6.6%
t 20226
 
6.0%
o 19852
 
5.9%
G 16343
 
4.9%
n 13437
 
4.0%
h 12548
 
3.7%
i 12159
 
3.6%
Other values (52) 121603
36.3%

georeferenceRemarks
Text

Missing 

Distinct135
Distinct (%)0.6%
Missing432224
Missing (%)94.9%
Memory size3.5 MiB

Length

Max length158
Median length2
Mean length7.225799444
Min length1

Characters and Unicode

Total characters166309
Distinct characters60
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique64 ?
Unique (%)0.3%

Sample

1st rowStart; End
2nd rowca
3rd rowCA
4th rowCA
5th rowCA
ValueCountFrequency (%)
ca 18411
46.3%
start 2530
 
6.4%
end 2436
 
6.1%
bank 1768
 
4.4%
flower 1768
 
4.4%
garden 1768
 
4.4%
for 977
 
2.5%
west 940
 
2.4%
east 828
 
2.1%
coordinates 580
 
1.5%
Other values (263) 7789
19.6%

Most occurring characters

ValueCountFrequency (%)
C 17103
 
10.3%
16779
 
10.1%
A 16548
 
10.0%
a 12099
 
7.3%
t 11571
 
7.0%
n 10340
 
6.2%
e 9905
 
6.0%
r 8710
 
5.2%
o 7884
 
4.7%
d 6230
 
3.7%
Other values (50) 49140
29.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 93445
56.2%
Uppercase Letter 47865
28.8%
Space Separator 16779
 
10.1%
Other Punctuation 6140
 
3.7%
Decimal Number 2011
 
1.2%
Dash Punctuation 65
 
< 0.1%
Open Punctuation 2
 
< 0.1%
Close Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 12099
12.9%
t 11571
12.4%
n 10340
11.1%
e 9905
10.6%
r 8710
9.3%
o 7884
8.4%
d 6230
6.7%
l 5770
6.2%
i 4482
 
4.8%
c 4310
 
4.6%
Other values (13) 12144
13.0%
Uppercase Letter
ValueCountFrequency (%)
C 17103
35.7%
A 16548
34.6%
E 3265
 
6.8%
S 2817
 
5.9%
G 2363
 
4.9%
B 1987
 
4.2%
F 1797
 
3.8%
W 1100
 
2.3%
O 216
 
0.5%
T 165
 
0.3%
Other values (7) 504
 
1.1%
Decimal Number
ValueCountFrequency (%)
1 400
19.9%
3 335
16.7%
6 216
10.7%
9 198
9.8%
8 196
9.7%
4 177
8.8%
2 164
8.2%
5 128
 
6.4%
0 123
 
6.1%
7 74
 
3.7%
Other Punctuation
ValueCountFrequency (%)
; 4208
68.5%
. 1334
 
21.7%
, 592
 
9.6%
" 4
 
0.1%
/ 1
 
< 0.1%
? 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
16779
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 65
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 141310
85.0%
Common 24999
 
15.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 17103
12.1%
A 16548
11.7%
a 12099
 
8.6%
t 11571
 
8.2%
n 10340
 
7.3%
e 9905
 
7.0%
r 8710
 
6.2%
o 7884
 
5.6%
d 6230
 
4.4%
l 5770
 
4.1%
Other values (30) 35150
24.9%
Common
ValueCountFrequency (%)
16779
67.1%
; 4208
 
16.8%
. 1334
 
5.3%
, 592
 
2.4%
1 400
 
1.6%
3 335
 
1.3%
6 216
 
0.9%
9 198
 
0.8%
8 196
 
0.8%
4 177
 
0.7%
Other values (10) 564
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 166309
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 17103
 
10.3%
16779
 
10.1%
A 16548
 
10.0%
a 12099
 
7.3%
t 11571
 
7.0%
n 10340
 
6.2%
e 9905
 
6.0%
r 8710
 
5.2%
o 7884
 
4.7%
d 6230
 
3.7%
Other values (50) 49140
29.5%
Distinct7
Distinct (%)100.0%
Missing455233
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length141
Median length134
Mean length126.7142857
Min length114

Characters and Unicode

Total characters887
Distinct characters31
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique7 ?
Unique (%)100.0%

Sample

1st rowAnimalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Ostariophysi, Siluriformes, Ictaluridae
2nd rowAnimalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Acanthopterygii, Perciformes, Labroidei, Labridae
3rd rowAnimalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Ostariophysi, Characiformes, Characidae
4th rowAnimalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Acanthopterygii, Perciformes, Labroidei, Cichlidae
5th rowAnimalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Paracanthopterygii, Lophiiformes, Ogcocephalioidei, Ogcocephalidae
ValueCountFrequency (%)
animalia 7
10.1%
vertebrata 7
10.1%
osteichthyes 7
10.1%
actinopterygii 7
10.1%
neopterygii 7
10.1%
chordata 7
10.1%
acanthopterygii 4
 
5.8%
perciformes 3
 
4.3%
labroidei 3
 
4.3%
ostariophysi 2
 
2.9%
Other values (15) 15
21.7%

Most occurring characters

ValueCountFrequency (%)
i 101
 
11.4%
e 82
 
9.2%
a 73
 
8.2%
t 73
 
8.2%
r 66
 
7.4%
, 62
 
7.0%
62
 
7.0%
o 45
 
5.1%
h 34
 
3.8%
c 33
 
3.7%
Other values (21) 256
28.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 694
78.2%
Uppercase Letter 69
 
7.8%
Other Punctuation 62
 
7.0%
Space Separator 62
 
7.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 101
14.6%
e 82
11.8%
a 73
10.5%
t 73
10.5%
r 66
9.5%
o 45
 
6.5%
h 34
 
4.9%
c 33
 
4.8%
y 28
 
4.0%
p 26
 
3.7%
Other values (9) 133
19.2%
Uppercase Letter
ValueCountFrequency (%)
A 18
26.1%
O 11
15.9%
C 11
15.9%
V 7
 
10.1%
N 7
 
10.1%
L 5
 
7.2%
S 4
 
5.8%
P 4
 
5.8%
I 1
 
1.4%
H 1
 
1.4%
Other Punctuation
ValueCountFrequency (%)
, 62
100.0%
Space Separator
ValueCountFrequency (%)
62
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 763
86.0%
Common 124
 
14.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 101
13.2%
e 82
10.7%
a 73
 
9.6%
t 73
 
9.6%
r 66
 
8.7%
o 45
 
5.9%
h 34
 
4.5%
c 33
 
4.3%
y 28
 
3.7%
p 26
 
3.4%
Other values (19) 202
26.5%
Common
ValueCountFrequency (%)
, 62
50.0%
62
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 887
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 101
 
11.4%
e 82
 
9.2%
a 73
 
8.2%
t 73
 
8.2%
r 66
 
7.4%
, 62
 
7.0%
62
 
7.0%
o 45
 
5.1%
h 34
 
3.8%
c 33
 
3.7%
Other values (21) 256
28.9%

latestEonOrHighestEonothem
Text

Constant  Missing 

Distinct1
Distinct (%)14.3%
Missing455233
Missing (%)> 99.9%
Memory size3.5 MiB
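
The figures above suggest that "Distinct (%)" is taken relative to the non-missing values (1 of 7 is 14.3%), while "Missing (%)" is taken relative to all 455240 observations. The sketch below reproduces the numbers under that assumption; `summarize` is a hypothetical helper, and the column is rebuilt from the counts shown here rather than from the real data.

```python
from collections import Counter

def summarize(values):
    """Distinct, unique and missing statistics for a column with None as missing."""
    n = len(values)
    present = [v for v in values if v is not None]
    freq = Counter(present)
    distinct = len(freq)
    # "Unique" counts values that occur exactly once.
    unique = sum(1 for c in freq.values() if c == 1)
    missing = n - len(present)
    return {
        "distinct": distinct,
        # Assumption: percentage is relative to non-missing values.
        "distinct_pct": 100 * distinct / len(present) if present else 0.0,
        "missing": missing,
        "missing_pct": 100 * missing / n if n else 0.0,
        "unique": unique,
    }

# Rebuilt from the report: 7 non-missing "Animalia" values, 455233 missing.
column = ["Animalia"] * 7 + [None] * 455233
stats = summarize(column)
print(stats)
```

Under this reading, a column flagged both "Constant" and "Missing" simply has a single repeated value among a handful of populated rows.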

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters56
Distinct characters6
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowAnimalia
2nd rowAnimalia
3rd rowAnimalia
4th rowAnimalia
5th rowAnimalia
ValueCountFrequency (%)
animalia 7
100.0%

Most occurring characters

ValueCountFrequency (%)
i 14
25.0%
a 14
25.0%
A 7
12.5%
n 7
12.5%
m 7
12.5%
l 7
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 49
87.5%
Uppercase Letter 7
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 14
28.6%
a 14
28.6%
n 7
14.3%
m 7
14.3%
l 7
14.3%
Uppercase Letter
ValueCountFrequency (%)
A 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 56
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 14
25.0%
a 14
25.0%
A 7
12.5%
n 7
12.5%
m 7
12.5%
l 7
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 56
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 14
25.0%
a 14
25.0%
A 7
12.5%
n 7
12.5%
m 7
12.5%
l 7
12.5%

earliestEraOrLowestErathem
Text

Constant  Missing 

Distinct1
Distinct (%)14.3%
Missing455233
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters56
Distinct characters7
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowChordata
2nd rowChordata
3rd rowChordata
4th rowChordata
5th rowChordata
ValueCountFrequency (%)
chordata 7
100.0%

Most occurring characters

ValueCountFrequency (%)
a 14
25.0%
C 7
12.5%
h 7
12.5%
o 7
12.5%
r 7
12.5%
d 7
12.5%
t 7
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 49
87.5%
Uppercase Letter 7
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 14
28.6%
h 7
14.3%
o 7
14.3%
r 7
14.3%
d 7
14.3%
t 7
14.3%
Uppercase Letter
ValueCountFrequency (%)
C 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 56
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 14
25.0%
C 7
12.5%
h 7
12.5%
o 7
12.5%
r 7
12.5%
d 7
12.5%
t 7
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 56
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 14
25.0%
C 7
12.5%
h 7
12.5%
o 7
12.5%
r 7
12.5%
d 7
12.5%
t 7
12.5%

latestEraOrHighestErathem
Text

Constant  Missing 

Distinct1
Distinct (%)14.3%
Missing455233
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length14
Median length14
Mean length14
Min length14

Characters and Unicode

Total characters98
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowActinopterygii
2nd rowActinopterygii
3rd rowActinopterygii
4th rowActinopterygii
5th rowActinopterygii
ValueCountFrequency (%)
actinopterygii 7
100.0%

Most occurring characters

ValueCountFrequency (%)
i 21
21.4%
t 14
14.3%
A 7
 
7.1%
c 7
 
7.1%
n 7
 
7.1%
o 7
 
7.1%
p 7
 
7.1%
e 7
 
7.1%
r 7
 
7.1%
y 7
 
7.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 91
92.9%
Uppercase Letter 7
 
7.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 21
23.1%
t 14
15.4%
c 7
 
7.7%
n 7
 
7.7%
o 7
 
7.7%
p 7
 
7.7%
e 7
 
7.7%
r 7
 
7.7%
y 7
 
7.7%
g 7
 
7.7%
Uppercase Letter
ValueCountFrequency (%)
A 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 98
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 21
21.4%
t 14
14.3%
A 7
 
7.1%
c 7
 
7.1%
n 7
 
7.1%
o 7
 
7.1%
p 7
 
7.1%
e 7
 
7.1%
r 7
 
7.1%
y 7
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 98
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 21
21.4%
t 14
14.3%
A 7
 
7.1%
c 7
 
7.1%
n 7
 
7.1%
o 7
 
7.1%
p 7
 
7.1%
e 7
 
7.1%
r 7
 
7.1%
y 7
 
7.1%
Distinct5
Distinct (%)71.4%
Missing455233
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length15
Median length13
Mean length12.14285714
Min length11

Characters and Unicode

Total characters85
Distinct characters18
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4 ?
Unique (%)57.1%

Sample

1st rowSiluriformes
2nd rowPerciformes
3rd rowCharaciformes
4th rowPerciformes
5th rowLophiiformes
ValueCountFrequency (%)
perciformes 3
42.9%
siluriformes 1
 
14.3%
characiformes 1
 
14.3%
lophiiformes 1
 
14.3%
scorpaeniformes 1
 
14.3%

Most occurring characters

ValueCountFrequency (%)
r 13
15.3%
e 11
12.9%
i 9
10.6%
o 9
10.6%
f 7
8.2%
m 7
8.2%
s 7
8.2%
c 5
 
5.9%
P 3
 
3.5%
a 3
 
3.5%
Other values (8) 11
12.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 78
91.8%
Uppercase Letter 7
 
8.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 13
16.7%
e 11
14.1%
i 9
11.5%
o 9
11.5%
f 7
9.0%
m 7
9.0%
s 7
9.0%
c 5
 
6.4%
a 3
 
3.8%
h 2
 
2.6%
Other values (4) 5
 
6.4%
Uppercase Letter
ValueCountFrequency (%)
P 3
42.9%
S 2
28.6%
C 1
 
14.3%
L 1
 
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 85
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 13
15.3%
e 11
12.9%
i 9
10.6%
o 9
10.6%
f 7
8.2%
m 7
8.2%
s 7
8.2%
c 5
 
5.9%
P 3
 
3.5%
a 3
 
3.5%
Other values (8) 11
12.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 85
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 13
15.3%
e 11
12.9%
i 9
10.6%
o 9
10.6%
f 7
8.2%
m 7
8.2%
s 7
8.2%
c 5
 
5.9%
P 3
 
3.5%
a 3
 
3.5%
Other values (8) 11
12.9%
Distinct7
Distinct (%)100.0%
Missing455233
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length15
Median length11
Mean length10.71428571
Min length8

Characters and Unicode

Total characters75
Distinct characters21
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique7 ?
Unique (%)100.0%

Sample

1st rowIctaluridae
2nd rowLabridae
3rd rowCharacidae
4th rowCichlidae
5th rowOgcocephalidae
ValueCountFrequency (%)
ictaluridae 1
14.3%
labridae 1
14.3%
characidae 1
14.3%
cichlidae 1
14.3%
ogcocephalidae 1
14.3%
scaridae 1
14.3%
hemitripteridae 1
14.3%

Most occurring characters

ValueCountFrequency (%)
a 13
17.3%
i 10
13.3%
e 10
13.3%
d 7
9.3%
r 6
8.0%
c 6
8.0%
t 3
 
4.0%
l 3
 
4.0%
h 3
 
4.0%
C 2
 
2.7%
Other values (11) 12
16.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 68
90.7%
Uppercase Letter 7
 
9.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 13
19.1%
i 10
14.7%
e 10
14.7%
d 7
10.3%
r 6
8.8%
c 6
8.8%
t 3
 
4.4%
l 3
 
4.4%
h 3
 
4.4%
p 2
 
2.9%
Other values (5) 5
 
7.4%
Uppercase Letter
ValueCountFrequency (%)
C 2
28.6%
I 1
14.3%
H 1
14.3%
S 1
14.3%
L 1
14.3%
O 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 75
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 13
17.3%
i 10
13.3%
e 10
13.3%
d 7
9.3%
r 6
8.0%
c 6
8.0%
t 3
 
4.0%
l 3
 
4.0%
h 3
 
4.0%
C 2
 
2.7%
Other values (11) 12
16.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 75
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 13
17.3%
i 10
13.3%
e 10
13.3%
d 7
9.3%
r 6
8.0%
c 6
8.0%
t 3
 
4.0%
l 3
 
4.0%
h 3
 
4.0%
C 2
 
2.7%
Other values (11) 12
16.0%

latestEpochOrHighestSeries
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing455239
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length43
Median length43
Mean length43
Min length43

Characters and Unicode

Total characters43
Distinct characters14
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowIndian Ocean, Red Sea, Sudan, Red Sea State
ValueCountFrequency (%)
red 2
25.0%
sea 2
25.0%
indian 1
12.5%
ocean 1
12.5%
sudan 1
12.5%
state 1
12.5%
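
The word table above lists lower-cased tokens with punctuation removed. One way to reproduce these counts is sketched below; the regular-expression tokeniser and the name `word_counts` are assumptions for illustration and may differ from how the report generator splits words.

```python
import re
from collections import Counter

def word_counts(values):
    """Count lower-cased word tokens across all values, ignoring punctuation."""
    counts = Counter()
    for value in values:
        counts.update(re.findall(r"\w+", value.lower()))
    return counts

# The single non-missing value shown in the Sample block for this variable.
words = word_counts(["Indian Ocean, Red Sea, Sudan, Red Sea State"])
total = sum(words.values())
for word, n in words.most_common():
    print(word, n, f"{100 * n / total:.1f}%")
```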

Most occurring characters

ValueCountFrequency (%)
7
16.3%
a 6
14.0%
e 6
14.0%
n 4
9.3%
d 4
9.3%
S 4
9.3%
, 3
7.0%
R 2
 
4.7%
t 2
 
4.7%
I 1
 
2.3%
Other values (4) 4
9.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 25
58.1%
Uppercase Letter 8
 
18.6%
Space Separator 7
 
16.3%
Other Punctuation 3
 
7.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6
24.0%
e 6
24.0%
n 4
16.0%
d 4
16.0%
t 2
 
8.0%
i 1
 
4.0%
c 1
 
4.0%
u 1
 
4.0%
Uppercase Letter
ValueCountFrequency (%)
S 4
50.0%
R 2
25.0%
I 1
 
12.5%
O 1
 
12.5%
Space Separator
ValueCountFrequency (%)
7
100.0%
Other Punctuation
ValueCountFrequency (%)
, 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 33
76.7%
Common 10
 
23.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6
18.2%
e 6
18.2%
n 4
12.1%
d 4
12.1%
S 4
12.1%
R 2
 
6.1%
t 2
 
6.1%
I 1
 
3.0%
i 1
 
3.0%
O 1
 
3.0%
Other values (2) 2
 
6.1%
Common
ValueCountFrequency (%)
7
70.0%
, 3
30.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 43
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7
16.3%
a 6
14.0%
e 6
14.0%
n 4
9.3%
d 4
9.3%
S 4
9.3%
, 3
7.0%
R 2
 
4.7%
t 2
 
4.7%
I 1
 
2.3%
Other values (4) 4
9.3%

earliestAgeOrLowestStage
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing455239
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length12
Median length12
Mean length12
Min length12

Characters and Unicode

Total characters12
Distinct characters9
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowIndian Ocean
ValueCountFrequency (%)
indian 1
50.0%
ocean 1
50.0%

Most occurring characters

ValueCountFrequency (%)
n 3
25.0%
a 2
16.7%
I 1
 
8.3%
d 1
 
8.3%
i 1
 
8.3%
1
 
8.3%
O 1
 
8.3%
c 1
 
8.3%
e 1
 
8.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9
75.0%
Uppercase Letter 2
 
16.7%
Space Separator 1
 
8.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 3
33.3%
a 2
22.2%
d 1
 
11.1%
i 1
 
11.1%
c 1
 
11.1%
e 1
 
11.1%
Uppercase Letter
ValueCountFrequency (%)
I 1
50.0%
O 1
50.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11
91.7%
Common 1
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 3
27.3%
a 2
18.2%
I 1
 
9.1%
d 1
 
9.1%
i 1
 
9.1%
O 1
 
9.1%
c 1
 
9.1%
e 1
 
9.1%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 3
25.0%
a 2
16.7%
I 1
 
8.3%
d 1
 
8.3%
i 1
 
8.3%
1
 
8.3%
O 1
 
8.3%
c 1
 
8.3%
e 1
 
8.3%

latestAgeOrHighestStage
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing455239
Missing (%)> 99.9%
Memory size3.5 MiB

Length

Max length21
Median length21
Mean length21
Min length21

Characters and Unicode

Total characters21
Distinct characters12
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowIndian Ocean, Red Sea
ValueCountFrequency (%)
indian 1
25.0%
ocean 1
25.0%
red 1
25.0%
sea 1
25.0%

Most occurring characters

ValueCountFrequency (%)
n 3
14.3%
a 3
14.3%
3
14.3%
e 3
14.3%
d 2
9.5%
I 1
 
4.8%
i 1
 
4.8%
O 1
 
4.8%
c 1
 
4.8%
, 1
 
4.8%
Other values (2) 2
9.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13
61.9%
Uppercase Letter 4
 
19.0%
Space Separator 3
 
14.3%
Other Punctuation 1
 
4.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 3
23.1%
a 3
23.1%
e 3
23.1%
d 2
15.4%
i 1
 
7.7%
c 1
 
7.7%
Uppercase Letter
ValueCountFrequency (%)
I 1
25.0%
O 1
25.0%
R 1
25.0%
S 1
25.0%
Space Separator
ValueCountFrequency (%)
3
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 17
81.0%
Common 4
 
19.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 3
17.6%
a 3
17.6%
e 3
17.6%
d 2
11.8%
I 1
 
5.9%
i 1
 
5.9%
O 1
 
5.9%
c 1
 
5.9%
R 1
 
5.9%
S 1
 
5.9%
Common
ValueCountFrequency (%)
3
75.0%
, 1
 
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 3
14.3%
a 3
14.3%
3
14.3%
e 3
14.3%
d 2
9.5%
I 1
 
4.8%
i 1
 
4.8%
O 1
 
4.8%
c 1
 
4.8%
, 1
 
4.8%
Other values (2) 2
9.5%

Distinct: 7
Distinct (%): 100.0%
Missing: 455233
Missing (%): > 99.9%
Memory size: 3.5 MiB

Length

Max length: 14
Median length: 10
Mean length: 8.714285714
Min length: 6

Characters and Unicode

Total characters: 61
Distinct characters: 22
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7
Unique (%): 100.0%

Sample

1st row: Noturus
2nd row: Thalassoma
3rd row: Brycon
4th row: Pseudotropheus
5th row: Halieutaea

Value | Count | Frequency (%)
noturus | 1 | 14.3%
thalassoma | 1 | 14.3%
brycon | 1 | 14.3%
pseudotropheus | 1 | 14.3%
halieutaea | 1 | 14.3%
scarus | 1 | 14.3%
blepsias | 1 | 14.3%

Most occurring characters

ValueCountFrequency (%)
s 8
13.1%
a 8
13.1%
u 6
 
9.8%
e 5
 
8.2%
o 5
 
8.2%
r 4
 
6.6%
t 3
 
4.9%
l 3
 
4.9%
B 2
 
3.3%
i 2
 
3.3%
Other values (12) 15
24.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 54
88.5%
Uppercase Letter 7
 
11.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 8
14.8%
a 8
14.8%
u 6
11.1%
e 5
9.3%
o 5
9.3%
r 4
7.4%
t 3
 
5.6%
l 3
 
5.6%
i 2
 
3.7%
h 2
 
3.7%
Other values (6) 8
14.8%
Uppercase Letter
ValueCountFrequency (%)
B 2
28.6%
H 1
14.3%
N 1
14.3%
P 1
14.3%
T 1
14.3%
S 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 61
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 8
13.1%
a 8
13.1%
u 6
 
9.8%
e 5
 
8.2%
o 5
 
8.2%
r 4
 
6.6%
t 3
 
4.9%
l 3
 
4.9%
B 2
 
3.3%
i 2
 
3.3%
Other values (12) 15
24.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 61
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 8
13.1%
a 8
13.1%
u 6
 
9.8%
e 5
 
8.2%
o 5
 
8.2%
r 4
 
6.6%
t 3
 
4.9%
l 3
 
4.9%
B 2
 
3.3%
i 2
 
3.3%
Other values (12) 15
24.6%
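
The length statistics for this variable can be checked by hand from its seven values. The sketch below uses only the standard library; the capitalisation of the last two names is inferred from the uppercase-letter table above, and `summary` is a name introduced here for illustration.

```python
from statistics import mean, median

# The seven distinct values listed for this variable.
values = [
    "Noturus", "Thalassoma", "Brycon", "Pseudotropheus",
    "Halieutaea", "Scarus", "Blepsias",
]

lengths = [len(v) for v in values]

summary = {
    "max": max(lengths),
    "min": min(lengths),
    "mean": round(mean(lengths), 9),
    "median": median(lengths),
    "total": sum(lengths),
}
print(summary)
```

The maximum (14), minimum (6), mean (8.714285714) and total character count (61) agree with the report. A plain median over these seven lengths comes out as 8 rather than the reported 10, so the report's "Median length" is presumably derived in some other way and may be worth treating with caution.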

lithostratigraphicTerms
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 455239
Missing (%): > 99.9%
Memory size: 3.5 MiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 5
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Sudan

Value | Count | Frequency (%)
sudan | 1 | 100.0%

Most occurring characters

ValueCountFrequency (%)
S 1
20.0%
u 1
20.0%
d 1
20.0%
a 1
20.0%
n 1
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4
80.0%
Uppercase Letter 1
 
20.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 1
25.0%
d 1
25.0%
a 1
25.0%
n 1
25.0%
Uppercase Letter
ValueCountFrequency (%)
S 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 1
20.0%
u 1
20.0%
d 1
20.0%
a 1
20.0%
n 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 1
20.0%
u 1
20.0%
d 1
20.0%
a 1
20.0%
n 1
20.0%
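
The "Missing (%)" figures follow from the missing count and the 455240 observations in the dataset. The sketch below shows the arithmetic together with a display rule inferred from the values in this report (percentages that round to 0.0 or 100.0 without being exactly 0 or 100 are shown as "< 0.1%" or "> 99.9%"). That rule is an assumption, not a documented behaviour of the profiler, and the function names are illustrative.

```python
TOTAL_OBSERVATIONS = 455240  # "Number of observations" from the report overview


def missing_pct(missing: int, total: int = TOTAL_OBSERVATIONS) -> float:
    """Share of missing cells for one variable, in percent."""
    return 100 * missing / total


def display_pct(pct: float) -> str:
    """Format a percentage the way it appears in this report (inferred rule)."""
    rounded = f"{pct:.1f}"
    if rounded == "0.0" and pct > 0:
        return "< 0.1%"
    if rounded == "100.0" and pct < 100:
        return "> 99.9%"
    return rounded + "%"


# Missing counts taken from the report: lithostratigraphicTerms, typeStatus, identifiedBy.
for missing in (455239, 435739, 421099):
    print(missing, display_pct(missing_pct(missing)))
```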

formation
Text

Missing 

Distinct: 8
Distinct (%): 100.0%
Missing: 455232
Missing (%): > 99.9%
Memory size: 3.5 MiB

Length

Max length: 13
Median length: 11.5
Mean length: 9.125
Min length: 6

Characters and Unicode

Total characters: 73
Distinct characters: 21
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): 100.0%

Sample

1st row: nocturnus
2nd row: Red Sea State
3rd row: lunare
4th row: falcatus
5th row: elongatus

Value | Count | Frequency (%)
nocturnus | 1 | 10.0%
red | 1 | 10.0%
sea | 1 | 10.0%
state | 1 | 10.0%
lunare | 1 | 10.0%
falcatus | 1 | 10.0%
elongatus | 1 | 10.0%
brevicauda | 1 | 10.0%
globiceps | 1 | 10.0%
cirrhosus | 1 | 10.0%

Most occurring characters

ValueCountFrequency (%)
a 8
11.0%
u 7
 
9.6%
e 7
 
9.6%
s 6
 
8.2%
c 5
 
6.8%
t 5
 
6.8%
r 5
 
6.8%
n 4
 
5.5%
l 4
 
5.5%
o 4
 
5.5%
Other values (11) 18
24.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 68
93.2%
Uppercase Letter 3
 
4.1%
Space Separator 2
 
2.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8
11.8%
u 7
10.3%
e 7
10.3%
s 6
8.8%
c 5
 
7.4%
t 5
 
7.4%
r 5
 
7.4%
n 4
 
5.9%
l 4
 
5.9%
o 4
 
5.9%
Other values (8) 13
19.1%
Uppercase Letter
ValueCountFrequency (%)
S 2
66.7%
R 1
33.3%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 71
97.3%
Common 2
 
2.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8
11.3%
u 7
9.9%
e 7
9.9%
s 6
 
8.5%
c 5
 
7.0%
t 5
 
7.0%
r 5
 
7.0%
n 4
 
5.6%
l 4
 
5.6%
o 4
 
5.6%
Other values (10) 16
22.5%
Common
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 73
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 8
11.0%
u 7
 
9.6%
e 7
 
9.6%
s 6
 
8.2%
c 5
 
6.8%
t 5
 
6.8%
r 5
 
6.8%
n 4
 
5.5%
l 4
 
5.5%
o 4
 
5.5%
Other values (11) 18
24.7%
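
The handful of values in formation look like species epithets and a place name rather than geological formations, which suggests that a few records have values shifted into the wrong columns. One way to locate such records is to flag every row in which a geological-context field is populated at all. The sketch below uses only the standard library; the sample rows, the catalogNumber values and the `GEO_COLUMNS` selection are invented for illustration and do not reproduce actual records.

```python
import csv
import io

# Geological-context columns that are expected to be empty for extant specimens.
# Assumption: this list is a hand-picked subset for illustration.
GEO_COLUMNS = ["latestAgeOrHighestStage", "lithostratigraphicTerms", "formation"]

# Hypothetical sample data; the identifiers and row contents are made up.
SAMPLE = """catalogNumber,latestAgeOrHighestStage,lithostratigraphicTerms,formation
A1,,,
A2,"Indian Ocean, Red Sea",Sudan,Red Sea State
A3,,,nocturnus
"""


def flag_suspect_rows(handle, columns=GEO_COLUMNS):
    """Return (catalogNumber, populated columns) for rows with any geological field set."""
    suspects = []
    for row in csv.DictReader(handle):
        filled = [c for c in columns if (row.get(c) or "").strip()]
        if filled:
            suspects.append((row["catalogNumber"], filled))
    return suspects


suspects = flag_suspect_rows(io.StringIO(SAMPLE))
for catalog_number, filled in suspects:
    print(catalog_number, filled)
```

Rows flagged this way would still need manual review, since the check only shows that a field is populated, not where its value belongs.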

identificationID
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 455239
Missing (%): > 99.9%
Memory size: 3.5 MiB

Length

Max length: 21
Median length: 21
Mean length: 21
Min length: 21

Characters and Unicode

Total characters: 21
Distinct characters: 16
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Mersa Towartit, Sudan

Value | Count | Frequency (%)
mersa | 1 | 33.3%
towartit | 1 | 33.3%
sudan | 1 | 33.3%

Most occurring characters

ValueCountFrequency (%)
a 3
14.3%
r 2
 
9.5%
2
 
9.5%
t 2
 
9.5%
M 1
 
4.8%
e 1
 
4.8%
s 1
 
4.8%
T 1
 
4.8%
o 1
 
4.8%
w 1
 
4.8%
Other values (6) 6
28.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15
71.4%
Uppercase Letter 3
 
14.3%
Space Separator 2
 
9.5%
Other Punctuation 1
 
4.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3
20.0%
r 2
13.3%
t 2
13.3%
e 1
 
6.7%
s 1
 
6.7%
o 1
 
6.7%
w 1
 
6.7%
i 1
 
6.7%
u 1
 
6.7%
d 1
 
6.7%
Uppercase Letter
ValueCountFrequency (%)
M 1
33.3%
T 1
33.3%
S 1
33.3%
Space Separator
ValueCountFrequency (%)
2
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18
85.7%
Common 3
 
14.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3
16.7%
r 2
11.1%
t 2
11.1%
M 1
 
5.6%
e 1
 
5.6%
s 1
 
5.6%
T 1
 
5.6%
o 1
 
5.6%
w 1
 
5.6%
i 1
 
5.6%
Other values (4) 4
22.2%
Common
ValueCountFrequency (%)
2
66.7%
, 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3
14.3%
r 2
 
9.5%
2
 
9.5%
t 2
 
9.5%
M 1
 
4.8%
e 1
 
4.8%
s 1
 
4.8%
T 1
 
4.8%
o 1
 
4.8%
w 1
 
4.8%
Other values (6) 6
28.6%

Distinct: 6
Distinct (%): 0.4%
Missing: 453543
Missing (%): 99.6%
Memory size: 3.5 MiB

Length

Max length: 16
Median length: 3
Mean length: 5.786682381
Min length: 3

Characters and Unicode

Total characters: 9820
Distinct characters: 19
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: cf.
2nd row: uncertain
3rd row: uncertain
4th row: uncertain
5th row: near

Value | Count | Frequency (%)
cf | 895 | 52.7%
uncertain | 783 | 46.1%
aff | 14 | 0.8%
near | 4 | 0.2%
jordan | 1 | 0.1%
(blank) | 1 | 0.1%
gilbert | 1 | 0.1%

Most occurring characters

ValueCountFrequency (%)
c 1678
17.1%
n 1571
16.0%
f 923
9.4%
. 909
9.3%
a 802
8.2%
r 789
8.0%
e 788
8.0%
i 784
8.0%
t 784
8.0%
u 652
 
6.6%
Other values (9) 140
 
1.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8775
89.4%
Other Punctuation 910
 
9.3%
Uppercase Letter 133
 
1.4%
Space Separator 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 1678
19.1%
n 1571
17.9%
f 923
10.5%
a 802
9.1%
r 789
9.0%
e 788
9.0%
i 784
8.9%
t 784
8.9%
u 652
 
7.4%
o 1
 
< 0.1%
Other values (3) 3
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
U 131
98.5%
J 1
 
0.8%
G 1
 
0.8%
Other Punctuation
ValueCountFrequency (%)
. 909
99.9%
& 1
 
0.1%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8908
90.7%
Common 912
 
9.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 1678
18.8%
n 1571
17.6%
f 923
10.4%
a 802
9.0%
r 789
8.9%
e 788
8.8%
i 784
8.8%
t 784
8.8%
u 652
 
7.3%
U 131
 
1.5%
Other values (6) 6
 
0.1%
Common
ValueCountFrequency (%)
. 909
99.7%
2
 
0.2%
& 1
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9820
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 1678
17.1%
n 1571
16.0%
f 923
9.4%
. 909
9.3%
a 802
8.2%
r 789
8.0%
e 788
8.0%
i 784
8.0%
t 784
8.0%
u 652
 
6.6%
Other values (9) 140
 
1.4%

typeStatus
Text

Missing 

Distinct: 48
Distinct (%): 0.2%
Missing: 435739
Missing (%): 95.7%
Memory size: 3.5 MiB

Length

Max length: 33
Median length: 8
Mean length: 8.129172863
Min length: 2

Characters and Unicode

Total characters: 158527
Distinct characters: 27
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 15
Unique (%): 0.1%

Sample

1st row: Paratype
2nd row: Holotype
3rd row: Paralectotype; Syntype
4th row: Paratype
5th row: Paratype

Value | Count | Frequency (%)
paratype | 12481 | 61.7%
holotype | 3366 | 16.6%
type | 1497 | 7.4%
syntype | 1443 | 7.1%
paralectotype | 655 | 3.2%
lectotype | 330 | 1.6%
cotype | 310 | 1.5%
neotype | 68 | 0.3%
ms | 39 | 0.2%
unconfirmed | 16 | 0.1%
Other values (2) | 18 | 0.1%

Most occurring characters

ValueCountFrequency (%)
a 26288
16.6%
y 21611
13.6%
e 21237
13.4%
p 20176
12.7%
t 19664
12.4%
r 13160
8.3%
P 13144
8.3%
o 8137
 
5.1%
l 4041
 
2.5%
H 3366
 
2.1%
Other values (17) 7703
 
4.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 136893
86.4%
Uppercase Letter 20223
 
12.8%
Space Separator 722
 
0.5%
Other Punctuation 689
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 26288
19.2%
y 21611
15.8%
e 21237
15.5%
p 20176
14.7%
t 19664
14.4%
r 13160
9.6%
o 8137
 
5.9%
l 4041
 
3.0%
n 1475
 
1.1%
c 1001
 
0.7%
Other values (5) 103
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
P 13144
65.0%
H 3366
 
16.6%
T 1497
 
7.4%
S 1443
 
7.1%
L 330
 
1.6%
C 310
 
1.5%
N 68
 
0.3%
M 39
 
0.2%
U 16
 
0.1%
A 10
 
< 0.1%
Space Separator
ValueCountFrequency (%)
722
100.0%
Other Punctuation
ValueCountFrequency (%)
; 689
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 157116
99.1%
Common 1411
 
0.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 26288
16.7%
y 21611
13.8%
e 21237
13.5%
p 20176
12.8%
t 19664
12.5%
r 13160
8.4%
P 13144
8.4%
o 8137
 
5.2%
l 4041
 
2.6%
H 3366
 
2.1%
Other values (15) 6292
 
4.0%
Common
ValueCountFrequency (%)
722
51.2%
; 689
48.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 158527
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 26288
16.6%
y 21611
13.6%
e 21237
13.4%
p 20176
12.7%
t 19664
12.4%
r 13160
8.3%
P 13144
8.3%
o 8137
 
5.1%
l 4041
 
2.5%
H 3366
 
2.1%
Other values (17) 7703
 
4.9%
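
Some typeStatus values combine several terms separated by a semicolon (for example "Paralectotype; Syntype"), so per-term counts like those in the word table require splitting each value first. The following is a minimal sketch using the five sample rows shown above; `type_terms` is a helper name introduced here, and the profiler's own tokenisation may differ.

```python
from collections import Counter


def type_terms(value: str) -> list[str]:
    """Split a typeStatus value on ';' and normalise each term to lowercase."""
    return [term.strip().lower() for term in value.split(";") if term.strip()]


# The five sample rows shown for typeStatus.
samples = ["Paratype", "Holotype", "Paralectotype; Syntype", "Paratype", "Paratype"]

term_counts = Counter(term for value in samples for term in type_terms(value))
for term, n in term_counts.most_common():
    print(f"{term} | {n}")
```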

identifiedBy
Text

Missing 

Distinct: 572
Distinct (%): 1.7%
Missing: 421099
Missing (%): 92.5%
Memory size: 3.5 MiB

Length

Max length: 147
Median length: 137
Mean length: 21.138836
Min length: 5

Characters and Unicode

Total characters: 721701
Distinct characters: 69
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 143
Unique (%): 0.4%

Sample

1st row: Pezold, Frank; Larson, Helen K.
2nd row: Williams, Jeffrey T.
3rd row: Williams, Jeffrey T.
4th row: Eschmeyer, William N.
5th row: Karnella, Susan J.

Value | Count | Frequency (%)
williams | 6495 | 5.8%
jeffrey | 6367 | 5.7%
t | 6366 | 5.7%
e | 4377 | 3.9%
david | 4213 | 3.8%
g | 4044 | 3.6%
smith | 3785 | 3.4%
c | 2656 | 2.4%
pitassy | 2527 | 2.3%
diane | 2527 | 2.3%
Other values (967) | 68438 | 61.2%

Most occurring characters

ValueCountFrequency (%)
77654
 
10.8%
a 55235
 
7.7%
i 54046
 
7.5%
e 50215
 
7.0%
, 37808
 
5.2%
r 34029
 
4.7%
l 32102
 
4.4%
n 30825
 
4.3%
. 26918
 
3.7%
t 26856
 
3.7%
Other values (59) 296013
41.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 443225
61.4%
Uppercase Letter 127592
 
17.7%
Space Separator 77654
 
10.8%
Other Punctuation 66175
 
9.2%
Dash Punctuation 2357
 
0.3%
Close Punctuation 2272
 
0.3%
Open Punctuation 2272
 
0.3%
Final Punctuation 77
 
< 0.1%
Initial Punctuation 77
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 55235
12.5%
i 54046
12.2%
e 50215
11.3%
r 34029
 
7.7%
l 32102
 
7.2%
n 30825
 
7.0%
t 26856
 
6.1%
s 22672
 
5.1%
o 22269
 
5.0%
m 19827
 
4.5%
Other values (21) 95149
21.5%
Uppercase Letter
ValueCountFrequency (%)
T 11485
 
9.0%
D 10254
 
8.0%
J 9552
 
7.5%
S 9476
 
7.4%
W 9319
 
7.3%
C 8939
 
7.0%
E 7479
 
5.9%
A 6920
 
5.4%
H 6037
 
4.7%
G 5572
 
4.4%
Other values (16) 42559
33.4%
Other Punctuation
ValueCountFrequency (%)
, 37808
57.1%
. 26918
40.7%
; 1042
 
1.6%
/ 387
 
0.6%
' 18
 
< 0.1%
& 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
77654
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2357
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2272
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2272
100.0%
Final Punctuation
ValueCountFrequency (%)
77
100.0%
Initial Punctuation
ValueCountFrequency (%)
77
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 570817
79.1%
Common 150884
 
20.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 55235
 
9.7%
i 54046
 
9.5%
e 50215
 
8.8%
r 34029
 
6.0%
l 32102
 
5.6%
n 30825
 
5.4%
t 26856
 
4.7%
s 22672
 
4.0%
o 22269
 
3.9%
m 19827
 
3.5%
Other values (47) 222741
39.0%
Common
ValueCountFrequency (%)
77654
51.5%
, 37808
25.1%
. 26918
 
17.8%
- 2357
 
1.6%
) 2272
 
1.5%
( 2272
 
1.5%
; 1042
 
0.7%
/ 387
 
0.3%
77
 
0.1%
77
 
0.1%
Other values (2) 20
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 721502
> 99.9%
Punctuation 154
 
< 0.1%
None 45
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
77654
 
10.8%
a 55235
 
7.7%
i 54046
 
7.5%
e 50215
 
7.0%
, 37808
 
5.2%
r 34029
 
4.7%
l 32102
 
4.4%
n 30825
 
4.3%
. 26918
 
3.7%
t 26856
 
3.7%
Other values (51) 295814
41.0%
Punctuation
ValueCountFrequency (%)
77
50.0%
77
50.0%
None
ValueCountFrequency (%)
ñ 31
68.9%
á 6
 
13.3%
Ö 2
 
4.4%
ü 2
 
4.4%
í 2
 
4.4%
ê 2
 
4.4%

Distinct: 30203
Distinct (%): 6.6%
Missing: 17
Missing (%): < 0.1%
Memory size: 3.5 MiB

Length

Max length: 69
Median length: 54
Mean length: 18.57404393
Min length: 2

Characters and Unicode

Total characters: 8455332
Distinct characters: 70
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9326
Unique (%): 2.0%

Sample

1st row: Echidna nebulosa
2nd row: Mugil
3rd row: Cryptocentrus filifer
4th row: Rhinichthys cataractae
5th row: Centropomus ensiferus

Value | Count | Frequency (%)
notropis | 7207 | 0.8%
etheostoma | 4890 | 0.6%
chaetodon | 4339 | 0.5%
gymnothorax | 4324 | 0.5%
lepomis | 4275 | 0.5%
lutjanus | 3888 | 0.5%
chromis | 3151 | 0.4%
halichoeres | 3126 | 0.4%
pomacentrus | 2957 | 0.3%
acanthurus | 2893 | 0.3%
Other values (18883) | 813972 | 95.2%

Most occurring characters

ValueCountFrequency (%)
s 773370
 
9.1%
a 769408
 
9.1%
i 700894
 
8.3%
o 619010
 
7.3%
e 559465
 
6.6%
u 535571
 
6.3%
r 508544
 
6.0%
t 479240
 
5.7%
n 446308
 
5.3%
399799
 
4.7%
Other values (60) 2663723
31.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7599188
89.9%
Uppercase Letter 455408
 
5.4%
Space Separator 399799
 
4.7%
Close Punctuation 292
 
< 0.1%
Open Punctuation 292
 
< 0.1%
Other Punctuation 152
 
< 0.1%
Dash Punctuation 111
 
< 0.1%
Decimal Number 90
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 773370
10.2%
a 769408
10.1%
i 700894
 
9.2%
o 619010
 
8.1%
e 559465
 
7.4%
u 535571
 
7.0%
r 508544
 
6.7%
t 479240
 
6.3%
n 446308
 
5.9%
l 365551
 
4.8%
Other values (16) 1841827
24.2%
Uppercase Letter
ValueCountFrequency (%)
C 63046
13.8%
P 53487
11.7%
S 52489
11.5%
A 43663
9.6%
E 29491
 
6.5%
L 27982
 
6.1%
M 27368
 
6.0%
H 25638
 
5.6%
G 21345
 
4.7%
N 21144
 
4.6%
Other values (16) 89755
19.7%
Decimal Number
ValueCountFrequency (%)
1 39
43.3%
2 25
27.8%
3 15
 
16.7%
4 3
 
3.3%
6 3
 
3.3%
9 2
 
2.2%
5 2
 
2.2%
7 1
 
1.1%
Other Punctuation
ValueCountFrequency (%)
. 84
55.3%
/ 41
27.0%
? 11
 
7.2%
& 8
 
5.3%
5
 
3.3%
# 3
 
2.0%
Space Separator
ValueCountFrequency (%)
399799
100.0%
Close Punctuation
ValueCountFrequency (%)
) 292
100.0%
Open Punctuation
ValueCountFrequency (%)
( 292
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 111
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8054596
95.3%
Common 400736
 
4.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 773370
 
9.6%
a 769408
 
9.6%
i 700894
 
8.7%
o 619010
 
7.7%
e 559465
 
6.9%
u 535571
 
6.6%
r 508544
 
6.3%
t 479240
 
5.9%
n 446308
 
5.5%
l 365551
 
4.5%
Other values (42) 2297235
28.5%
Common
ValueCountFrequency (%)
399799
99.8%
) 292
 
0.1%
( 292
 
0.1%
- 111
 
< 0.1%
. 84
 
< 0.1%
/ 41
 
< 0.1%
1 39
 
< 0.1%
2 25
 
< 0.1%
3 15
 
< 0.1%
? 11
 
< 0.1%
Other values (8) 27
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8455327
> 99.9%
Punctuation 5
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 773370
 
9.1%
a 769408
 
9.1%
i 700894
 
8.3%
o 619010
 
7.3%
e 559465
 
6.6%
u 535571
 
6.3%
r 508544
 
6.0%
t 479240
 
5.7%
n 446308
 
5.3%
399799
 
4.7%
Other values (59) 2663718
31.5%
Punctuation
ValueCountFrequency (%)
5
100.0%

Distinct: 863
Distinct (%): 0.2%
Missing: 241
Missing (%): 0.1%
Memory size: 3.5 MiB

Length

Max length: 164
Median length: 155
Mean length: 131.5153901
Min length: 8

Characters and Unicode

Total characters: 59839371
Distinct characters: 51
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 67
Unique (%): < 0.1%

Sample

1st row: Animalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Elopomorpha, Anguilliformes, Muraenoidei, Muraenidae, Muraeninae
2nd row: Animalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Acanthopterygii, Perciformes, Percoidei, Mugilidae
3rd row: Animalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Acanthopterygii, Perciformes, Gobioidei, Gobiidae, Gobiinae
4th row: Animalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Ostariophysi, Cypriniformes, Cyprinidae
5th row: Animalia, Chordata, Vertebrata, Osteichthyes, Actinopterygii, Neopterygii, Acanthopterygii, Perciformes, Percoidei, Centropomidae

Value | Count | Frequency (%)
chordata | 454990 | 9.9%
animalia | 454946 | 9.9%
vertebrata | 454435 | 9.8%
osteichthyes | 444540 | 9.6%
actinopterygii | 444484 | 9.6%
neopterygii | 444050 | 9.6%
acanthopterygii | 293106 | 6.4%
perciformes | 213819 | 4.6%
percoidei | 96931 | 2.1%
ostariophysi | 67594 | 1.5%
Other values (969) | 1246072 | 27.0%

Most occurring characters

ValueCountFrequency (%)
i 6691210
 
11.2%
e 5764979
 
9.6%
t 4862469
 
8.1%
a 4453446
 
7.4%
, 4159968
 
7.0%
4159968
 
7.0%
r 4156335
 
6.9%
o 3437353
 
5.7%
h 2157435
 
3.6%
n 2101379
 
3.5%
Other values (41) 17894829
29.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 46904468
78.4%
Uppercase Letter 4614967
 
7.7%
Other Punctuation 4159968
 
7.0%
Space Separator 4159968
 
7.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 6691210
14.3%
e 5764979
12.3%
t 4862469
10.4%
a 4453446
9.5%
r 4156335
8.9%
o 3437353
 
7.3%
h 2157435
 
4.6%
n 2101379
 
4.5%
y 1975452
 
4.2%
c 1930243
 
4.1%
Other values (16) 9374167
20.0%
Uppercase Letter
ValueCountFrequency (%)
A 1292577
28.0%
C 704749
15.3%
O 545299
11.8%
V 454509
 
9.8%
N 451703
 
9.8%
P 431330
 
9.3%
S 209844
 
4.5%
G 105664
 
2.3%
L 88683
 
1.9%
B 85581
 
1.9%
Other values (13) 245028
 
5.3%
Other Punctuation
ValueCountFrequency (%)
, 4159968
100.0%
Space Separator
ValueCountFrequency (%)
4159968
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 51519435
86.1%
Common 8319936
 
13.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 6691210
13.0%
e 5764979
11.2%
t 4862469
 
9.4%
a 4453446
 
8.6%
r 4156335
 
8.1%
o 3437353
 
6.7%
h 2157435
 
4.2%
n 2101379
 
4.1%
y 1975452
 
3.8%
c 1930243
 
3.7%
Other values (39) 13989134
27.2%
Common
ValueCountFrequency (%)
, 4159968
50.0%
4159968
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 59839371
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 6691210
 
11.2%
e 5764979
 
9.6%
t 4862469
 
8.1%
a 4453446
 
7.4%
, 4159968
 
7.0%
4159968
 
7.0%
r 4156335
 
6.9%
o 3437353
 
5.7%
h 2157435
 
3.6%
n 2101379
 
3.5%
Other values (41) 17894829
29.9%

Distinct: 3
Distinct (%): < 0.1%
Missing: 266
Missing (%): 0.1%
Memory size: 3.5 MiB

Length

Max length: 8
Median length: 8
Mean length: 7.999978021
Min length: 7

Characters and Unicode

Total characters: 3639782
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Animalia
2nd row: Animalia
3rd row: Animalia
4th row: Animalia
5th row: Animalia

Value | Count | Frequency (%)
animalia | 454946 | > 99.9%
animalis | 18 | < 0.1%
animala | 10 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
i 909938
25.0%
a 909930
25.0%
A 454974
12.5%
n 454974
12.5%
m 454974
12.5%
l 454974
12.5%
s 18
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3184808
87.5%
Uppercase Letter 454974
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 909938
28.6%
a 909930
28.6%
n 454974
14.3%
m 454974
14.3%
l 454974
14.3%
s 18
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
A 454974
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3639782
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 909938
25.0%
a 909930
25.0%
A 454974
12.5%
n 454974
12.5%
m 454974
12.5%
l 454974
12.5%
s 18
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3639782
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 909938
25.0%
a 909930
25.0%
A 454974
12.5%
n 454974
12.5%
m 454974
12.5%
l 454974
12.5%
s 18
 
< 0.1%
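
The value table for this variable lists two rare spellings, "animalis" (18 records) and "animala" (10 records), next to "animalia", which look like typos. A cleaning step could map such near-matches onto the canonical name. The sketch below uses `difflib.get_close_matches` from the standard library; the `CANONICAL` list, the cutoff of 0.85 and the `normalise_kingdom` helper are illustrative choices, not part of the dataset or the profiler.

```python
import difflib

# Accepted spellings; assumption: only one canonical kingdom is needed here.
CANONICAL = ["Animalia"]


def normalise_kingdom(value: str, cutoff: float = 0.85) -> str:
    """Map a near-miss spelling onto the canonical name, else return it unchanged."""
    candidate = value.strip().capitalize()
    match = difflib.get_close_matches(candidate, CANONICAL, n=1, cutoff=cutoff)
    return match[0] if match else value


for raw in ("Animalia", "Animalis", "Animala", "Plantae"):
    print(raw, "->", normalise_kingdom(raw))
```

A fairly high cutoff keeps unrelated names such as "Plantae" from being rewritten; any automatic correction of this kind should still be reviewed before it is applied to the source records.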

phylum
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 250
Missing (%): 0.1%
Memory size: 3.5 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 3639920
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Chordata
2nd row: Chordata
3rd row: Chordata
4th row: Chordata
5th row: Chordata

Value | Count | Frequency (%)
chordata | 454990 | 100.0%

Most occurring characters

ValueCountFrequency (%)
a 909980
25.0%
C 454990
12.5%
h 454990
12.5%
o 454990
12.5%
r 454990
12.5%
d 454990
12.5%
t 454990
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3184930
87.5%
Uppercase Letter 454990
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 909980
28.6%
h 454990
14.3%
o 454990
14.3%
r 454990
14.3%
d 454990
14.3%
t 454990
14.3%
Uppercase Letter
ValueCountFrequency (%)
C 454990
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3639920
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 909980
25.0%
C 454990
12.5%
h 454990
12.5%
o 454990
12.5%
r 454990
12.5%
d 454990
12.5%
t 454990
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3639920
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 909980
25.0%
C 454990
12.5%
h 454990
12.5%
o 454990
12.5%
r 454990
12.5%
d 454990
12.5%
t 454990
12.5%

class
Text

Distinct: 7
Distinct (%): < 0.1%
Missing: 295
Missing (%): 0.1%
Memory size: 3.5 MiB

Length

Max length: 18
Median length: 14
Mean length: 14.00336744
Min length: 6

Characters and Unicode

Total characters: 6370762
Distinct characters: 23
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Actinopterygii
2nd row: Actinopterygii
3rd row: Actinopterygii
4th row: Actinopterygii
5th row: Actinopterygii

Value | Count | Frequency (%)
actinopterygii | 444484 | 97.7%
chondrichthyes | 9185 | 2.0%
cephalaspidomorphi | 565 | 0.1%
cephalochordata | 514 | 0.1%
myxini | 150 | < 0.1%
sarcopterygii | 42 | < 0.1%
elasmobranchii | 5 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
i 1344161
21.1%
t 898709
14.1%
o 455874
 
7.2%
r 454837
 
7.1%
e 454790
 
7.1%
c 454230
 
7.1%
y 453861
 
7.1%
n 453824
 
7.1%
p 446735
 
7.0%
g 444526
 
7.0%
Other values (13) 509215
 
8.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5915817
92.9%
Uppercase Letter 454945
 
7.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1344161
22.7%
t 898709
15.2%
o 455874
 
7.7%
r 454837
 
7.7%
e 454790
 
7.7%
c 454230
 
7.7%
y 453861
 
7.7%
n 453824
 
7.7%
p 446735
 
7.6%
g 444526
 
7.5%
Other values (8) 54270
 
0.9%
Uppercase Letter
ValueCountFrequency (%)
A 444484
97.7%
C 10264
 
2.3%
M 150
 
< 0.1%
S 42
 
< 0.1%
E 5
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 6370762
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 1344161
21.1%
t 898709
14.1%
o 455874
 
7.2%
r 454837
 
7.1%
e 454790
 
7.1%
c 454230
 
7.1%
y 453861
 
7.1%
n 453824
 
7.1%
p 446735
 
7.0%
g 444526
 
7.0%
Other values (13) 509215
 
8.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6370762
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 1344161
21.1%
t 898709
14.1%
o 455874
 
7.2%
r 454837
 
7.1%
e 454790
 
7.1%
c 454230
 
7.1%
y 453861
 
7.1%
n 453824
 
7.1%
p 446735
 
7.0%
g 444526
 
7.0%
Other values (13) 509215
 
8.0%

order
Text

Distinct: 71
Distinct (%): < 0.1%
Missing: 450
Missing (%): 0.1%
Memory size: 3.5 MiB

Length

Max length: 20
Median length: 19
Mean length: 12.4550122
Min length: 9

Characters and Unicode

Total characters: 5664415
Distinct characters: 37
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: Anguilliformes
2nd row: Perciformes
3rd row: Perciformes
4th row: Cypriniformes
5th row: Perciformes

Value | Count | Frequency (%)
perciformes | 213819 | 47.0%
cypriniformes | 33749 | 7.4%
scorpaeniformes | 17653 | 3.9%
characiformes | 17476 | 3.8%
anguilliformes | 17118 | 3.8%
siluriformes | 14285 | 3.1%
myctophiformes | 13710 | 3.0%
pleuronectiformes | 12323 | 2.7%
stomiiformes | 12086 | 2.7%
tetraodontiformes | 10528 | 2.3%
Other values (61) | 92043 | 20.2%

Most occurring characters

ValueCountFrequency (%)
r 811837
14.3%
e 764648
13.5%
o 588951
10.4%
i 566889
10.0%
m 477158
8.4%
s 464239
8.2%
f 454790
8.0%
c 293642
 
5.2%
P 227486
 
4.0%
n 146406
 
2.6%
Other values (27) 868369
15.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5209625
92.0%
Uppercase Letter 454790
 
8.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 811837
15.6%
e 764648
14.7%
o 588951
11.3%
i 566889
10.9%
m 477158
9.2%
s 464239
8.9%
f 454790
8.7%
c 293642
 
5.6%
n 146406
 
2.8%
p 101446
 
1.9%
Other values (13) 539619
10.4%
Uppercase Letter
ValueCountFrequency (%)
P 227486
50.0%
C 72733
 
16.0%
S 57202
 
12.6%
A 29924
 
6.6%
B 17157
 
3.8%
M 14949
 
3.3%
T 10792
 
2.4%
G 9468
 
2.1%
O 7184
 
1.6%
L 3883
 
0.9%
Other values (4) 4012
 
0.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 5664415
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 811837
14.3%
e 764648
13.5%
o 588951
10.4%
i 566889
10.0%
m 477158
8.4%
s 464239
8.2%
f 454790
8.0%
c 293642
 
5.2%
P 227486
 
4.0%
n 146406
 
2.6%
Other values (27) 868369
15.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5664415
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 811837
14.3%
e 764648
13.5%
o 588951
10.4%
i 566889
10.0%
m 477158
8.4%
s 464239
8.2%
f 454790
8.0%
c 293642
 
5.2%
P 227486
 
4.0%
n 146406
 
2.6%
Other values (27) 868369
15.3%

family
Text

Distinct 556
Distinct (%) 0.1%
Missing 903
Missing (%) 0.2%
Memory size 3.5 MiB

Length

Max length 19
Median length 17
Mean length 10.77570614
Min length 6

Characters and Unicode

Total characters 4895802
Distinct characters 49
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 21
Unique (%) < 0.1%

Sample

1st row Muraenidae
2nd row Mugilidae
3rd row Gobiidae
4th row Cyprinidae
5th row Centropomidae
Value Count Frequency (%)
cyprinidae 27640 6.1%
gobiidae 26015 5.7%
pomacentridae 16209 3.6%
labridae 14636 3.2%
blenniidae 14508 3.2%
myctophidae 13555 3.0%
apogonidae 12382 2.7%
serranidae 11411 2.5%
characidae 10639 2.3%
stomiidae 7664 1.7%
Other values (546) 299678 66.0%

Most occurring characters

ValueCountFrequency (%)
a 709961
14.5%
e 649841
13.3%
i 645875
13.2%
d 488821
10.0%
o 277365
 
5.7%
r 275590
 
5.6%
n 252341
 
5.2%
t 211303
 
4.3%
c 159904
 
3.3%
h 140357
 
2.9%
Other values (39) 1084444
22.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4441465
90.7%
Uppercase Letter 454337
 
9.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 709961
16.0%
e 649841
14.6%
i 645875
14.5%
d 488821
11.0%
o 277365
 
6.2%
r 275590
 
6.2%
n 252341
 
5.7%
t 211303
 
4.8%
c 159904
 
3.6%
h 140357
 
3.2%
Other values (16) 630107
14.2%
Uppercase Letter
ValueCountFrequency (%)
C 101161
22.3%
S 65080
14.3%
P 51714
11.4%
M 37367
 
8.2%
A 35620
 
7.8%
G 34900
 
7.7%
L 29803
 
6.6%
B 27268
 
6.0%
H 16882
 
3.7%
E 14405
 
3.2%
Other values (13) 40137
 
8.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 4895802
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 709961
14.5%
e 649841
13.3%
i 645875
13.2%
d 488821
10.0%
o 277365
 
5.7%
r 275590
 
5.6%
n 252341
 
5.2%
t 211303
 
4.3%
c 159904
 
3.3%
h 140357
 
2.9%
Other values (39) 1084444
22.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4895802
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 709961
14.5%
e 649841
13.3%
i 645875
13.2%
d 488821
10.0%
o 277365
 
5.7%
r 275590
 
5.6%
n 252341
 
5.2%
t 211303
 
4.3%
c 159904
 
3.3%
h 140357
 
2.9%
Other values (39) 1084444
22.2%

genus
Text

Missing 

Distinct 5469
Distinct (%) 1.3%
Missing 23303
Missing (%) 5.1%
Memory size 3.5 MiB

Length

Max length 35
Median length 19
Mean length 9.851295444
Min length 2

Characters and Unicode

Total characters 4255139
Distinct characters 55
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 863
Unique (%) 0.2%

Sample

1st row Echidna
2nd row Mugil
3rd row Cryptocentrus
4th row Rhinichthys
5th row Centropomus
Value Count Frequency (%)
notropis 7201 1.7%
etheostoma 4848 1.1%
gymnothorax 4324 1.0%
lepomis 4273 1.0%
chaetodon 4248 1.0%
lutjanus 3807 0.9%
halichoeres 3126 0.7%
chromis 3122 0.7%
pomacentrus 2956 0.7%
acanthurus 2892 0.7%
Other values (5462) 391269 90.6%

Most occurring characters

ValueCountFrequency (%)
o 401438
 
9.4%
s 398934
 
9.4%
a 333701
 
7.8%
i 299750
 
7.0%
e 276218
 
6.5%
r 259728
 
6.1%
t 246556
 
5.8%
u 244842
 
5.8%
n 220470
 
5.2%
h 207702
 
4.9%
Other values (45) 1365800
32.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3823062
89.8%
Uppercase Letter 431937
 
10.2%
Space Separator 129
 
< 0.1%
Other Punctuation 11
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 401438
10.5%
s 398934
10.4%
a 333701
 
8.7%
i 299750
 
7.8%
e 276218
 
7.2%
r 259728
 
6.8%
t 246556
 
6.4%
u 244842
 
6.4%
n 220470
 
5.8%
h 207702
 
5.4%
Other values (16) 933723
24.4%
Uppercase Letter
ValueCountFrequency (%)
C 59175
13.7%
P 51463
11.9%
S 48630
11.3%
A 41611
9.6%
E 28279
 
6.5%
L 26639
 
6.2%
H 25142
 
5.8%
M 25085
 
5.8%
N 20912
 
4.8%
G 18808
 
4.4%
Other values (16) 86193
20.0%
Other Punctuation
ValueCountFrequency (%)
. 6
54.5%
? 5
45.5%
Space Separator
ValueCountFrequency (%)
129
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4254999
> 99.9%
Common 140
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 401438
 
9.4%
s 398934
 
9.4%
a 333701
 
7.8%
i 299750
 
7.0%
e 276218
 
6.5%
r 259728
 
6.1%
t 246556
 
5.8%
u 244842
 
5.8%
n 220470
 
5.2%
h 207702
 
4.9%
Other values (42) 1365660
32.1%
Common
ValueCountFrequency (%)
129
92.1%
. 6
 
4.3%
? 5
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4255139
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 401438
 
9.4%
s 398934
 
9.4%
a 333701
 
7.8%
i 299750
 
7.0%
e 276218
 
6.5%
r 259728
 
6.1%
t 246556
 
5.8%
u 244842
 
5.8%
n 220470
 
5.2%
h 207702
 
4.9%
Other values (45) 1365800
32.1%

subgenus
Text

Missing 

Distinct 57
Distinct (%) 19.9%
Missing 454953
Missing (%) 99.9%
Memory size 3.5 MiB

Length

Max length 15
Median length 13
Mean length 9.700348432
Min length 2

Characters and Unicode

Total characters 2784
Distinct characters 43
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 18
Unique (%) 6.3%

Sample

1st row Ecsenius
2nd row Watasea
3rd row Distoechodon
4th row Ecsenius
5th row Artediellops
Value Count Frequency (%)
meiacanthus 47 16.4%
ecsenius 34 11.8%
dasson 31 10.8%
watasea 12 4.2%
allomeiacanthus 10 3.5%
hadropterus 9 3.1%
odontohypopomus 8 2.8%
musgravius 8 2.8%
ulocentra 7 2.4%
hyporhamphus 6 2.1%
Other values (47) 115 40.1%
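
The percentages in the subgenus summary can be cross-checked with a little arithmetic. The sketch below assumes that Missing (%) is taken over all 455240 observations, while Distinct (%), Unique (%) and the value frequencies are taken over the non-missing cells only. That reading is an inference from the figures, not something the report states. The helper `pct` is a name introduced here for illustration.

```python
# Figures copied from the report: total observations (dataset overview)
# and the subgenus summary.
n_observations = 455240
n_missing = 454953
n_distinct = 57
n_unique = 18
top_count = 47  # count for "meiacanthus"

# Cells that actually hold a subgenus value.
n_present = n_observations - n_missing


def pct(part, whole):
    """Percentage rounded to one decimal place, as displayed in the report."""
    return round(100 * part / whole, 1)


print(n_present)                       # non-missing cells
print(pct(n_missing, n_observations))  # Missing (%)
print(pct(n_distinct, n_present))      # Distinct (%)
print(pct(n_unique, n_present))        # Unique (%)
print(pct(top_count, n_present))       # frequency of the top value
```

Under that assumption the results match the reported 99.9%, 19.9%, 6.3% and 16.4%, and the counts in the value table sum to the 287 non-missing cells.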

Most occurring characters

ValueCountFrequency (%)
s 329
11.8%
a 309
11.1%
o 228
 
8.2%
n 198
 
7.1%
e 192
 
6.9%
u 189
 
6.8%
i 185
 
6.6%
t 169
 
6.1%
c 135
 
4.8%
h 119
 
4.3%
Other values (33) 731
26.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2497
89.7%
Uppercase Letter 287
 
10.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 329
13.2%
a 309
12.4%
o 228
9.1%
n 198
7.9%
e 192
7.7%
u 189
7.6%
i 185
7.4%
t 169
 
6.8%
c 135
 
5.4%
h 119
 
4.8%
Other values (13) 444
17.8%
Uppercase Letter
ValueCountFrequency (%)
M 60
20.9%
D 43
15.0%
E 41
14.3%
A 33
11.5%
H 22
 
7.7%
W 12
 
4.2%
C 12
 
4.2%
O 10
 
3.5%
N 9
 
3.1%
P 9
 
3.1%
Other values (10) 36
12.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 2784
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 329
11.8%
a 309
11.1%
o 228
 
8.2%
n 198
 
7.1%
e 192
 
6.9%
u 189
 
6.8%
i 185
 
6.6%
t 169
 
6.1%
c 135
 
4.8%
h 119
 
4.3%
Other values (33) 731
26.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2784
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 329
11.8%
a 309
11.1%
o 228
 
8.2%
n 198
 
7.1%
e 192
 
6.9%
u 189
 
6.8%
i 185
 
6.6%
t 169
 
6.1%
c 135
 
4.8%
h 119
 
4.3%
Other values (33) 731
26.3%

specificEpithet
Text

Missing 

Distinct 13201
Distinct (%) 3.4%
Missing 67324
Missing (%) 14.8%
Memory size 3.5 MiB

Length

Max length 29
Median length 24
Mean length 8.888016993
Min length 2

Characters and Unicode

Total characters 3447804
Distinct characters 43
Distinct categories 7
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3137
Unique (%) 0.8%

Sample

1st row nebulosa
2nd row filifer
3rd row cataractae
4th row ensiferus
5th row inferomaculata
Value Count Frequency (%)
maculatus 1820 0.5%
fasciatus 1655 0.4%
lineatus 1591 0.4%
punctatus 1556 0.4%
affinis 1525 0.4%
nigricans 1446 0.4%
ocellatus 1437 0.4%
cornutus 1264 0.3%
notatus 1168 0.3%
niger 1161 0.3%
Other values (13175) 373462 96.2%

Most occurring characters

ValueCountFrequency (%)
a 387982
11.3%
s 360537
10.5%
i 359278
10.4%
u 278315
 
8.1%
e 243128
 
7.1%
r 229222
 
6.6%
t 216551
 
6.3%
n 208679
 
6.1%
o 195702
 
5.7%
l 194112
 
5.6%
Other values (33) 774298
22.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3447299
> 99.9%
Space Separator 169
 
< 0.1%
Other Punctuation 133
 
< 0.1%
Dash Punctuation 107
 
< 0.1%
Decimal Number 86
 
< 0.1%
Open Punctuation 5
 
< 0.1%
Close Punctuation 5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 387982
11.3%
s 360537
10.5%
i 359278
10.4%
u 278315
 
8.1%
e 243128
 
7.1%
r 229222
 
6.6%
t 216551
 
6.3%
n 208679
 
6.1%
o 195702
 
5.7%
l 194112
 
5.6%
Other values (16) 773793
22.4%
Decimal Number
ValueCountFrequency (%)
1 37
43.0%
2 24
27.9%
3 14
 
16.3%
4 3
 
3.5%
6 3
 
3.5%
5 2
 
2.3%
9 2
 
2.3%
7 1
 
1.2%
Other Punctuation
ValueCountFrequency (%)
. 78
58.6%
/ 41
30.8%
? 6
 
4.5%
5
 
3.8%
# 3
 
2.3%
Space Separator
ValueCountFrequency (%)
169
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 107
100.0%
Open Punctuation
ValueCountFrequency (%)
( 5
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3447299
> 99.9%
Common 505
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 387982
11.3%
s 360537
10.5%
i 359278
10.4%
u 278315
 
8.1%
e 243128
 
7.1%
r 229222
 
6.6%
t 216551
 
6.3%
n 208679
 
6.1%
o 195702
 
5.7%
l 194112
 
5.6%
Other values (16) 773793
22.4%
Common
ValueCountFrequency (%)
169
33.5%
- 107
21.2%
. 78
15.4%
/ 41
 
8.1%
1 37
 
7.3%
2 24
 
4.8%
3 14
 
2.8%
? 6
 
1.2%
5
 
1.0%
( 5
 
1.0%
Other values (7) 19
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3447799
> 99.9%
Punctuation 5
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 387982
11.3%
s 360537
10.5%
i 359278
10.4%
u 278315
 
8.1%
e 243128
 
7.1%
r 229222
 
6.6%
t 216551
 
6.3%
n 208679
 
6.1%
o 195702
 
5.7%
l 194112
 
5.6%
Other values (32) 774293
22.5%
Punctuation
ValueCountFrequency (%)
5
100.0%

infraspecificEpithet
Text

Missing 

Distinct 912
Distinct (%) 9.2%
Missing 445321
Missing (%) 97.8%
Memory size 3.5 MiB

Length

Max length 19
Median length 16
Mean length 8.896461337
Min length 2

Characters and Unicode

Total characters 88244
Distinct characters 31
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 304
Unique (%) 3.1%

Sample

1st row niloticus
2nd row ramosus
3rd row vexillare
4th row vermiculatus
5th row sciera
Value Count Frequency (%)
leptocephalus 303 3.1%
atromaculatus 225 2.3%
crocodilus 222 2.2%
atratulus 169 1.7%
vermiculatus 156 1.6%
ferox 145 1.5%
commersoni 138 1.4%
purpurescens 135 1.4%
salmoides 124 1.2%
interocularis 121 1.2%
Other values (903) 8184 82.5%

Most occurring characters

ValueCountFrequency (%)
s 9790
11.1%
a 9335
10.6%
i 7995
9.1%
u 7974
9.0%
e 6182
 
7.0%
r 6077
 
6.9%
o 5911
 
6.7%
l 5742
 
6.5%
c 5258
 
6.0%
t 5108
 
5.8%
Other values (21) 18872
21.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 88235
> 99.9%
Dash Punctuation 3
 
< 0.1%
Space Separator 3
 
< 0.1%
Decimal Number 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 9790
11.1%
a 9335
10.6%
i 7995
9.1%
u 7974
9.0%
e 6182
 
7.0%
r 6077
 
6.9%
o 5911
 
6.7%
l 5742
 
6.5%
c 5258
 
6.0%
t 5108
 
5.8%
Other values (16) 18863
21.4%
Decimal Number
ValueCountFrequency (%)
1 1
33.3%
3 1
33.3%
2 1
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%
Space Separator
ValueCountFrequency (%)
3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 88235
> 99.9%
Common 9
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 9790
11.1%
a 9335
10.6%
i 7995
9.1%
u 7974
9.0%
e 6182
 
7.0%
r 6077
 
6.9%
o 5911
 
6.7%
l 5742
 
6.5%
c 5258
 
6.0%
t 5108
 
5.8%
Other values (16) 18863
21.4%
Common
ValueCountFrequency (%)
- 3
33.3%
3
33.3%
1 1
 
11.1%
3 1
 
11.1%
2 1
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 88244
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 9790
11.1%
a 9335
10.6%
i 7995
9.1%
u 7974
9.0%
e 6182
 
7.0%
r 6077
 
6.9%
o 5911
 
6.7%
l 5742
 
6.5%
c 5258
 
6.0%
t 5108
 
5.8%
Other values (21) 18872
21.4%

taxonRank
Text

Constant  Missing 

Distinct 1
Distinct (%) < 0.1%
Missing 445321
Missing (%) 97.8%
Memory size 3.5 MiB

Length

Max length 10
Median length 10
Mean length 10
Min length 10

Characters and Unicode

Total characters 99190
Distinct characters 7
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row subspecies
2nd row subspecies
3rd row subspecies
4th row subspecies
5th row subspecies
Value Count Frequency (%)
subspecies 9919 100.0%

Most occurring characters

ValueCountFrequency (%)
s 29757
30.0%
e 19838
20.0%
u 9919
 
10.0%
b 9919
 
10.0%
p 9919
 
10.0%
c 9919
 
10.0%
i 9919
 
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 99190
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 29757
30.0%
e 19838
20.0%
u 9919
 
10.0%
b 9919
 
10.0%
p 9919
 
10.0%
c 9919
 
10.0%
i 9919
 
10.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 99190
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 29757
30.0%
e 19838
20.0%
u 9919
 
10.0%
b 9919
 
10.0%
p 9919
 
10.0%
c 9919
 
10.0%
i 9919
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 99190
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 29757
30.0%
e 19838
20.0%
u 9919
 
10.0%
b 9919
 
10.0%
p 9919
 
10.0%
c 9919
 
10.0%
i 9919
 
10.0%

Distinct 1552
Distinct (%) 4.0%
Missing 416165
Missing (%) 91.4%
Memory size 3.5 MiB

Length

Max length 38
Median length 29
Mean length 11.0434293
Min length 2

Characters and Unicode

Total characters 431522
Distinct characters 60
Distinct categories 7
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 385
Unique (%) 1.0%

Sample

1st row Vari et al.
2nd row Larson
3rd row Evermann & Goldsborough
4th row Randall & Swerdloff
5th row Girard
Value Count Frequency (%)
(blank) 16117 21.1%
randall 3014 3.9%
jordan 2700 3.5%
gilbert 2282 3.0%
et 2241 2.9%
al 2241 2.9%
schultz 2140 2.8%
bean 1674 2.2%
springer 1556 2.0%
gill 1196 1.6%
Other values (979) 41206 54.0%
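
The value table for this variable looks like a count of lowercased word tokens rather than of whole cell values: "et" and "al" are listed separately with the same count, and the blank entry has the same count (16117) as the "&" character in the character tables. The sketch below shows one tokenisation that would give that pattern: lowercase, split on whitespace, strip ASCII punctuation. This is a guess that fits the numbers, not the profiler's documented behaviour, and `token_counts` is a hypothetical helper name.

```python
import string
from collections import Counter


def token_counts(values):
    """Lowercase each value, split on whitespace, strip ASCII punctuation."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            # A token made only of punctuation, such as "&", becomes "".
            counts[token.strip(string.punctuation)] += 1
    return counts


# The five sample rows shown for this variable.
sample = ["Vari et al.", "Larson", "Evermann & Goldsborough",
          "Randall & Swerdloff", "Girard"]
counts = token_counts(sample)
print(counts.most_common(3))
```

With this scheme the two "&" separators in the sample rows collapse into the empty token, which is how a blank row could end up at the top of the table.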

Most occurring characters

ValueCountFrequency (%)
e 39668
 
9.2%
37292
 
8.6%
a 36842
 
8.5%
l 31118
 
7.2%
r 30515
 
7.1%
n 29998
 
7.0%
i 21801
 
5.1%
o 19194
 
4.4%
t 17436
 
4.0%
& 16117
 
3.7%
Other values (50) 151541
35.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 317872
73.7%
Uppercase Letter 56790
 
13.2%
Space Separator 37292
 
8.6%
Other Punctuation 18513
 
4.3%
Dash Punctuation 723
 
0.2%
Open Punctuation 166
 
< 0.1%
Close Punctuation 166
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 39668
12.5%
a 36842
11.6%
l 31118
9.8%
r 30515
9.6%
n 29998
9.4%
i 21801
 
6.9%
o 19194
 
6.0%
t 17436
 
5.5%
d 14154
 
4.5%
s 11782
 
3.7%
Other values (18) 65364
20.6%
Uppercase Letter
ValueCountFrequency (%)
S 8394
14.8%
G 7388
13.0%
B 5312
9.4%
R 5180
9.1%
J 3635
 
6.4%
L 3179
 
5.6%
H 3091
 
5.4%
M 3086
 
5.4%
C 2777
 
4.9%
W 2491
 
4.4%
Other values (15) 12257
21.6%
Other Punctuation
ValueCountFrequency (%)
& 16117
87.1%
. 2375
 
12.8%
' 21
 
0.1%
Space Separator
ValueCountFrequency (%)
37292
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 723
100.0%
Open Punctuation
ValueCountFrequency (%)
( 166
100.0%
Close Punctuation
ValueCountFrequency (%)
) 166
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 374662
86.8%
Common 56860
 
13.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 39668
 
10.6%
a 36842
 
9.8%
l 31118
 
8.3%
r 30515
 
8.1%
n 29998
 
8.0%
i 21801
 
5.8%
o 19194
 
5.1%
t 17436
 
4.7%
d 14154
 
3.8%
s 11782
 
3.1%
Other values (43) 122154
32.6%
Common
ValueCountFrequency (%)
37292
65.6%
& 16117
28.3%
. 2375
 
4.2%
- 723
 
1.3%
( 166
 
0.3%
) 166
 
0.3%
' 21
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 431486
> 99.9%
None 36
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 39668
 
9.2%
37292
 
8.6%
a 36842
 
8.5%
l 31118
 
7.2%
r 30515
 
7.1%
n 29998
 
7.0%
i 21801
 
5.1%
o 19194
 
4.4%
t 17436
 
4.0%
& 16117
 
3.7%
Other values (48) 151505
35.1%
None
ValueCountFrequency (%)
ü 30
83.3%
á 6
 
16.7%
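
The 36 characters listed under the "None" block are the only non-ASCII characters in this variable. In the Unicode Standard both ü (U+00FC) and á (U+00E1) belong to the Latin-1 Supplement block, so the "None" label probably reflects a gap in the profiler's block lookup rather than a property of the data. A minimal sketch for finding such characters in a column follows. The input values are hypothetical examples, not records from the dataset, and `non_ascii_counts` is a name introduced here.

```python
import unicodedata
from collections import Counter


def non_ascii_counts(values):
    """Count every character whose code point lies outside ASCII (> 0x7F)."""
    counts = Counter()
    for value in values:
        counts.update(ch for ch in value if ord(ch) > 0x7F)
    return counts


# Hypothetical values for illustration; they are not taken from the dataset.
values = ["Günther", "Müller & Troschel", "Girard"]
result = non_ascii_counts(values)
for ch, n in result.items():
    # Print the code point, the official character name and the count.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} {n}")
```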